goldenpipe

v0.2.0

Golden Suite orchestrator — chains GoldenCheck, GoldenFlow, and GoldenMatch into one adaptive pipeline. TypeScript port of the goldenpipe Python library.

goldenpipe

Golden Suite orchestrator for TypeScript — chains GoldenCheck → GoldenFlow → GoldenMatch into one adaptive, pluggable pipeline. TypeScript port of the goldenpipe Python library.

It composes the edge-safe cores of the three sibling packages: GoldenCheck, GoldenFlow, and GoldenMatch.

Data flows through the pipeline as Row[] (arrays of plain objects).

Install

npm install goldenpipe
# the three siblings come along as dependencies

yaml is an optional peer dependency, needed only for YAML config loading:

npm install yaml

Quick start

import { runDf } from "goldenpipe";

const rows = [
  { first_name: "John", last_name: "Smith", email: "[email protected]" },
  { first_name: "Jon",  last_name: "Smith", email: "[email protected]" },
  { first_name: "Jane", last_name: "Doe",   email: "[email protected]" },
];

// Zero-config: runs goldencheck.scan -> goldenflow.transform -> goldenmatch.dedupe
const result = await runDf(rows);

console.log(result.status);          // "success"
console.log(result.inputRows);       // 3
console.log(result.artifacts.golden); // golden (canonical) records
console.log(result.artifacts.unique); // distinct records

Async: the runner is async because GoldenMatch's dedupe is async. runDf, runStages, Pipeline.run, and the Node entry point's run(source) all return promises, so await each of them.

From a CSV file (Node)

import { run } from "goldenpipe/node";

const result = await run("people.csv");          // zero-config
const result2 = await run("people.csv", { config: "pipeline.yml" });

Custom pipeline config

import { runDf, makePipelineConfig, makeStageSpec } from "goldenpipe";

const config = makePipelineConfig({
  pipeline: "check-and-dedupe",
  stages: [
    "goldencheck.scan",
    makeStageSpec({ use: "goldenmatch.dedupe", config: { threshold: 0.9 } }),
    // omit goldenflow.transform to skip transformation
  ],
});

const result = await runDf(rows, config);

Programmatic stages

import { runStages, stage, StageStatus } from "goldenpipe";

const myStage = stage(
  { name: "tagger", produces: ["tag"], consumes: ["df"] },
  (ctx) => {
    ctx.artifacts.tag = (ctx.df ?? []).length;
    return { status: StageStatus.SUCCESS };
  },
);

const result = await runStages([myStage], rows);

CLI

goldenpipe-js run people.csv [-c pipeline.yml] [-v]   # run the chain on a CSV
goldenpipe-js stages                                  # list registered stages
goldenpipe-js validate -c pipeline.yml                # dry-run wiring validation
goldenpipe-js init [-d .]                             # scaffold a goldenpipe.yml
goldenpipe-js mcp-serve                               # run the MCP server (stdio)
goldenpipe-js agent-serve [-p 8250]                   # run the A2A agent server (HTTP)
goldenpipe-js serve [-p 8000]                         # run the REST API server (HTTP)

Servers (MCP / A2A / REST)

GoldenPipe ships three server surfaces, each exposing the same 4 operations as the Python sibling — list_stages, validate_pipeline, run_pipeline, explain_pipeline:

  • MCP (stdio, JSON-RPC 2.0): goldenpipe-js mcp-serve or the goldenpipe-mcp bin.
  • A2A (HTTP, port 8250): goldenpipe-js agent-serve — agent card at /.well-known/agent.json, skill dispatch at POST /tasks.
  • REST (HTTP, port 8000): goldenpipe-js serve, with routes GET /stages, POST /validate, POST /run.

Wire the MCP server into a client (e.g. Claude Desktop):

{ "mcpServers": { "goldenpipe": { "command": "goldenpipe-mcp" } } }
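
For a client that talks to the MCP server directly over stdio, a tool invocation is a JSON-RPC 2.0 request with the method tools/call. The sketch below only builds such a message. It assumes the four operations are registered as MCP tools under the names listed above and that list_stages accepts an empty argument object; this README confirms neither. buildToolCall is a hypothetical helper, not part of goldenpipe.

```typescript
// Hypothetical helper: builds an MCP "tools/call" request (JSON-RPC 2.0).
// Assumption: goldenpipe-mcp registers its operations as tools with these names.
type ToolName =
  | "list_stages"
  | "validate_pipeline"
  | "run_pipeline"
  | "explain_pipeline";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: ToolName,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The MCP stdio transport sends one JSON message per line.
// A real session must complete the MCP initialize handshake before this call.
const listStagesLine = JSON.stringify(buildToolCall(1, "list_stages")) + "\n";
console.log(listStagesLine);
```

In practice an MCP client library handles framing and the handshake; the sketch is only meant to show what crosses the wire.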

Architecture

flowchart LR
  L[load] --> C[goldencheck.scan]
  C --> F[goldenflow.transform]
  F --> M[goldenmatch.dedupe]

| Stage | Wraps | Produces |
|-------|-------|----------|
| load | built-in | df |
| goldencheck.scan | scanData(TabularData) | findings, profile, column_contexts |
| goldenflow.transform | new TransformEngine(cfg).transformDf(rows) | df, manifest |
| goldenmatch.dedupe | await dedupe(rows, { config }) | clusters, golden, unique, dupes, match_stats, scored_pairs |

The engine layer mirrors the Python design:

  • registry — a STATIC registry (buildDefaultRegistry()) replacing Python's entry-point discovery.
  • resolver — builds an ExecutionPlan, auto-prepends load, validates consumes/produces wiring.
  • router — applies a stage's Decision (skip / insert / abort) to the remaining plan.
  • runner — async stage execution with per-stage error handling + skipIf gating.
  • reporter — assembles the PipeResult (status, stages, artifacts, errors, reasoning, timing).
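
The resolver's job, as described in the list above, can be illustrated with a small standalone sketch: prepend load when it is missing, then walk the plan and check that every consumed artifact was produced by an earlier stage. StageInfo, validateWiring, and resolvePlan are hypothetical names for illustration; the real ExecutionPlan and resolver API may look different.

```typescript
// Hypothetical shapes for illustration; the real goldenpipe types may differ.
interface StageInfo {
  name: string;
  consumes: string[];
  produces: string[];
}

const loadStage: StageInfo = { name: "load", consumes: [], produces: ["df"] };

// Walk the plan in order and report any artifact consumed before it is produced.
function validateWiring(plan: StageInfo[]): string[] {
  const available = new Set<string>();
  const errors: string[] = [];
  for (const s of plan) {
    for (const need of s.consumes) {
      if (!available.has(need)) {
        errors.push(`${s.name} consumes "${need}" before any stage produces it`);
      }
    }
    for (const out of s.produces) available.add(out);
  }
  return errors;
}

// Auto-prepend load if the caller left it out, then validate the wiring.
function resolvePlan(stages: StageInfo[]): StageInfo[] {
  const hasLoad = stages.length > 0 && stages[0].name === "load";
  const plan = hasLoad ? stages : [loadStage, ...stages];
  const errors = validateWiring(plan);
  if (errors.length > 0) throw new Error(errors.join("; "));
  return plan;
}

const dedupeStage: StageInfo = {
  name: "goldenmatch.dedupe",
  consumes: ["df"],
  produces: ["golden", "unique"],
};

console.log(resolvePlan([dedupeStage]).map((s) => s.name)); // load, then goldenmatch.dedupe
console.log(validateWiring([dedupeStage]).length);          // 1 ("df" is missing without load)
```

Checking wiring before any stage runs is what makes a dry run like goldenpipe-js validate possible: a misordered or incomplete plan fails fast instead of partway through a run.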

A column-context pipeline carries semantic metadata across stages: GoldenCheck builds ColumnContexts (name-regex classification + IQR cardinality banding + identifier inference), GoldenFlow enriches them (date transforms confirm date type), and GoldenMatch consumes them to build a targeted dedupe config (buildConfigFromContexts) instead of re-profiling.

Decisions (adaptive routing)

severityGate, piiRouter, and rowCountGate are ported. They are not wired into the default chain — add them to a custom runner / stage that returns their Decision.

TS sibling skew: GoldenCheck-JS Finding.severity is a numeric enum (INFO/WARNING/ERROR) with no "critical" level, and there is no "pii_detection" check. So severityGate and piiRouter are effectively no-ops against current GoldenCheck-JS output — they exist for structural parity and so custom stages emitting those findings still route.
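
To make the routing idea concrete, here is a minimal sketch of a gate that returns a Decision and a router step that applies it to the remaining plan. The Decision shape, applyDecision, and tinyInputGate are all hypothetical; the README only states that a Decision can skip, insert, or abort, and the threshold below is made up rather than taken from rowCountGate.

```typescript
// Hypothetical Decision shape; the real goldenpipe type may differ.
type Decision =
  | { kind: "skip"; stages: string[] }
  | { kind: "insert"; stages: string[] }
  | { kind: "abort"; reason: string };

// Apply one decision to the stages that have not run yet.
function applyDecision(remainingStages: string[], d: Decision): string[] {
  switch (d.kind) {
    case "skip":
      return remainingStages.filter((s) => !d.stages.includes(s));
    case "insert":
      return [...d.stages, ...remainingStages];
    case "abort":
      return [];
  }
}

// A row-count gate in the spirit of rowCountGate: with fewer than minRows rows
// there is nothing to deduplicate, so skip the dedupe stage.
function tinyInputGate(rowCount: number, minRows = 2): Decision | null {
  return rowCount < minRows ? { kind: "skip", stages: ["goldenmatch.dedupe"] } : null;
}

const pendingStages = ["goldenflow.transform", "goldenmatch.dedupe"];
const gateDecision = tinyInputGate(1);
const nextStages = gateDecision ? applyDecision(pendingStages, gateDecision) : pendingStages;
console.log(nextStages); // only goldenflow.transform remains
```

A custom stage would compute such a Decision from the context (row count, findings, and so on) and return it so the router can adjust the remaining plan.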

Deferred (not in this v1 port)

  • identity_resolve stage — GoldenMatch-JS Identity Graph wiring through the pipeline. The edge-safe InMemoryIdentityStore exists in goldenmatch, but the pipeline-driven resolveClusters population is not yet exposed.
  • infer_schema stage — InferMap-based schema inference is not ported.
  • Textual TUI — the Python Textual TUI is not ported. (The MCP, A2A, and REST servers are ported — see above.)

Sibling version-skew artifacts

The TS siblings are version-skewed from the Python ones, so some artifacts the Python pipeline surfaces are shaped differently or absent here:

  • golden artifact maps to GoldenMatch-JS DedupeResult.goldenRecords (the Python sibling exposes .golden).
  • scored_pairs is GoldenMatch-JS result.scoredPairs (camelCase).
  • matchkey_used is derived from the built config's first matchkey — the JS DedupeResult does not carry the resolved matchkey list back (the Python result does after auto-config).
  • The Python goldencheck.scan adapter calls scan_file(path), so the in-memory run_df path fails that stage. GoldenCheck-JS's scanData operates on rows, so the TS adapter's scan succeeds in both the in-memory (runDf) and file (run) paths.
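
The first three mappings above amount to a small adapter between the GoldenMatch-JS result and the pipeline's snake_case artifact names. The sketch below shows that renaming under stated assumptions: DedupeResultLike and BuiltConfigLike are trimmed-down, hypothetical shapes, and the matchkeys/name fields on the config are guesses. Only goldenRecords and scoredPairs are named in this README.

```typescript
type Row = Record<string, unknown>;

// Hypothetical, trimmed-down stand-ins for the GoldenMatch-JS types.
interface DedupeResultLike {
  goldenRecords: Row[];   // named in this README
  scoredPairs: unknown[]; // named in this README
}
interface BuiltConfigLike {
  matchkeys: { name: string }[]; // assumed shape, not confirmed by the README
}

// Map the JS result onto the artifact names the pipeline exposes.
function toArtifacts(result: DedupeResultLike, config: BuiltConfigLike) {
  return {
    golden: result.goldenRecords,
    scored_pairs: result.scoredPairs,
    // The JS result does not carry the resolved matchkeys back,
    // so fall back to the first matchkey of the built config.
    matchkey_used: config.matchkeys[0]?.name ?? null,
  };
}

const exampleArtifacts = toArtifacts(
  { goldenRecords: [{ email: "a@example.com" }], scoredPairs: [] },
  { matchkeys: [{ name: "email_exact" }] },
);
console.log(exampleArtifacts.matchkey_used); // "email_exact"
```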

Cross-language parity

tests/parity/pipe-parity.test.ts asserts skew-robust invariants (status, input_rows, ordered per-stage status/skip sequence, final golden/unique counts) against Python-generated goldens in tests/fixtures/pipe_parity.json. Regenerate the goldens with:

uv run --project packages/python/goldenpipe python \
  packages/python/goldenpipe/scripts/emit_ts_parity_fixtures.py

License

MIT