pilotswarm-horizon-store

v0.3.0

HorizonDB-backed enhanced facts and graph providers for PilotSwarm.

Optional HorizonDB-backed providers for PilotSwarm enhanced facts/search and the open knowledge graph. The package is published alongside pilotswarm-sdk and is intended for apps that keep stock PostgreSQL as the PilotSwarm runtime store while opting into HorizonDB for knowledge retrieval and graph-backed harvester workflows.

What this is

An optional, enhanced read interface over the PilotSwarm Facts Store, built exclusively on Azure HorizonDB (preview) capabilities:

| HorizonDB capability | Role here |
| --- | --- |
| pg_textsearch | Ranked lexical recall (upgrade over today's LIKE / key_pattern) |
| HTTP embedding endpoint | In-DB embedding generation via a configurable endpoint + semantic (vector ANN) recall |
| Apache AGE | Relationship/lineage graph overlay (structure only) |
| pg_durable | Durable, idle-aware maintenance pipeline (embeddings only) |

Embeddings come from a configurable HTTP endpoint, not HorizonDB's built-in aiModelManagement. You pass an OpenAI/Azure-OpenAI-compatible embeddings endpoint to the provider; the pg_durable loop calls it over HTTP from inside the database (sql/006), and a Node fallback (embedPending()) covers clusters without the http extension. See CRAWLER-SPEC.md §3.
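As an illustration of what "OpenAI-compatible" means on the wire, here is a minimal sketch of building the request and reading the response for such an endpoint. The names `EmbeddingConfig`, `buildEmbeddingRequest`, and `parseEmbeddingResponse` are hypothetical and are not exports of this package; the request and response shapes follow the public OpenAI embeddings API, and the config fields mirror the `embedding` option shown in the usage example.

```typescript
// Hypothetical helper names; this is a sketch, not the package's embedding client.
interface EmbeddingConfig {
  url: string;    // full URL of the embeddings endpoint
  model: string;  // for example "text-embedding-3-small"
  dim: number;    // expected vector length
  apiKey: string;
}

interface EmbeddingRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the HTTP request for a batch of texts (OpenAI-style request body).
function buildEmbeddingRequest(cfg: EmbeddingConfig, texts: string[]): EmbeddingRequest {
  return {
    url: cfg.url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // OpenAI uses a Bearer token; Azure OpenAI deployments typically
        // expect an "api-key" header instead.
        Authorization: `Bearer ${cfg.apiKey}`,
      },
      body: JSON.stringify({ model: cfg.model, input: texts }),
    },
  };
}

// Extract vectors from an OpenAI-style response: { data: [{ index, embedding }] }.
// Rows are sorted by index so the output order matches the input order.
function parseEmbeddingResponse(json: unknown, cfg: EmbeddingConfig): number[][] {
  const data = (json as { data?: { index: number; embedding: number[] }[] }).data;
  if (!Array.isArray(data)) throw new Error("unexpected embeddings response shape");
  return [...data]
    .sort((a, b) => a.index - b.index)
    .map((row) => {
      if (row.embedding.length !== cfg.dim) {
        throw new Error(`expected ${cfg.dim} dimensions, got ${row.embedding.length}`);
      }
      return row.embedding;
    });
}
```

With a real endpoint, the two halves would be joined by something like `const res = await fetch(req.url, req.init)` followed by `parseEmbeddingResponse(await res.json(), cfg)`. Checking the vector length against `dim` catches a model/column mismatch before anything is written.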

Hard design rules (carried from PilotSwarm)

  1. The facts table stays authoritative. tsvector, vector, and AGE are derived indexes/overlays — never the source of truth. The graph stores ids and structure, never fact values or ACLs.
  2. Governance is unchanged. Scope (scope_key), shared / transient, namespace ACLs, and spawn-tree visibility are still enforced by stored procedures. Search modes are extra AND clauses inside the existing visibility filter — they can only narrow what a caller already sees.
  3. Determinism boundary. Anything LLM/IO (embedding, distillation, relatedness) runs as a pg_durable activity, never inline in the orchestration.
  4. Rebuildable. Every derived artifact (tsvector, embeddings, AGE graph) can be dropped and rebuilt from facts rows.
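Rule 2 can be illustrated with a small in-memory sketch. It is only a model of the idea: in the package the enforcement lives in stored procedures, and the `Fact`, `Caller`, `isVisible`, and `searchVisible` names below are hypothetical, with visibility reduced to scope and the shared flag for brevity.

```typescript
// Hypothetical in-memory model; real enforcement happens in stored procedures.
interface Fact {
  key: string;
  value: string;
  scopeKey: string;
  shared: boolean;
}

interface Caller {
  scopeKey: string;
}

// Stand-in for the existing visibility filter (scope + shared only).
function isVisible(fact: Fact, caller: Caller): boolean {
  return fact.shared || fact.scopeKey === caller.scopeKey;
}

// The search predicate is ANDed with visibility, so it can only narrow
// the set of facts the caller could already read.
function searchVisible(
  facts: Fact[],
  caller: Caller,
  matches: (f: Fact) => boolean,
): Fact[] {
  return facts.filter((f) => isVisible(f, caller) && matches(f));
}
```

Because the predicate is combined with `&&`, a fact that matches the query but sits in another caller's private scope never appears in the result, whichever search mode produced the match.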

Layout

packages/horizon-store/
  SPEC.md            ← the design: data model, compute/API/frequency, scenarios
  CRAWLER.md         ← open, ontology-free LLM graph crawler (entities + free-form relationships)
  CRAWLER-SPEC.md    ← implementation contract: API, schema, compute tiers, PG mailing-list example
  src/
    types.ts          ← EnhancedFactStore + GraphCrawlerInterface contracts + DTOs
    config.ts         ← provider config incl. the embedding HTTP endpoint
    embedding-client.ts ← Node-side embeddings client (query-time + test reference)
    query-builder.ts  ← DB-less hybrid ranking/fusion + SQL fragment builders (unit-tested)
    graph-model.ts    ← DB-less open-graph quality core: canonicalize, predicate, confidence (unit-tested)
    migrations.ts     ← Node-runnable schema + AGE setup (mirrors sql/001–005)
    http-embedding.ts ← Node-runnable in-DB HTTP embedding pipeline (mirrors sql/006)
    horizon-store.ts  ← HorizonFactStore: drop-in EnhancedFactStore + open-graph crawler
    agent-tools.ts    ← optional agent tools (search / related / graph) for injection
    index.ts          ← public exports
  sql/
    001_enrich_facts.sql  ← tsvector + embedding columns + indexes
    002_age_graph.sql     ← AGE node/edge model + structural backfill
    003_search_procs.sql  ← facts_search_facts + related/lineage procs
    004_pipelines.sql     ← pg_durable maintenance pipeline (embeddings only)
    005_open_graph.sql    ← open Entity / REL / EVIDENCED_BY graph for the crawler
    006_embeddings_http.sql ← in-DB HTTP embedding pipeline (replaces aiModelManagement)
  poc/                ← runnable harnesses (lexical/hybrid/crawler run DB-less today)
  test/               ← DB-less unit tests (run in CI without HorizonDB)
    integration/      ← live tests against a real HorizonDB (skip without HORIZON_DATABASE_URL)

Drop-in replacement

HorizonFactStore implements the full PilotSwarm FactStore API (storeFact / readFacts / deleteFact / stats / …) with identical semantics, so it can replace PgFactStore anywhere. It adds retrieval methods (searchFacts / relatedFacts / lineageFacts) and the open-graph crawler (upsertEntity / assertRelationship / …). Apps opt into the extras; nothing in the base behavior changes.

import { HorizonFactStore, createFactsTools } from "pilotswarm-horizon-store";

// EMBED_URL and KEY are placeholders: supply your embeddings endpoint URL and API key.
const store = await HorizonFactStore.create({
  connectionString: process.env.HORIZON_DATABASE_URL,
  embedding: { url: EMBED_URL, model: "text-embedding-3-small", dim: 1536, apiKey: KEY },
});
await store.initialize();

// ...use exactly like the existing FactStore, plus:
await store.searchFacts("jsonb subscripting", { mode: "hybrid" }, { unrestricted: true });

// Optionally inject tools into your agents:
const tools = createFactsTools(store, { graphWrite: true }); // spread into worker.registerTools([...])
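
The `mode: "hybrid"` option above combines lexical and semantic recall. This README does not say which fusion method the package uses, so the following is only a sketch of one common choice, reciprocal rank fusion (RRF), under that assumption. The function name `reciprocalRankFusion` and the constant `k = 60` are illustrative and not part of the package API.

```typescript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank),
// where rank starts at 1 in each input list.
// Assumption: this is a generic technique, not necessarily the package's method.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  // Highest fused score first; ties broken by id for a stable order.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Example: fact ids ranked by a lexical pass and by a semantic pass.
const lexical = ["a", "b", "c"];
const semantic = ["b", "d", "a"];
const fused = reciprocalRankFusion([lexical, semantic]);
console.log(fused.map((r) => r.id).join(",")); // prints "b,a,d,c"
```

RRF only needs ranks, not comparable scores, which is why it is often used to merge a text-search ranking with a vector-distance ranking.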

Running

cd packages/horizon-store
npm install
npm run build
npm test                 # DB-less unit tests — no HorizonDB needed
npm run poc:crawler      # open-graph harvest scenario — runs DB-less today

# Live integration tests against a real HorizonDB (auto-skip if unset):
export HORIZON_DATABASE_URL=postgres://user:pw@host/db
npm run test:integration # pg_durable HTTP embeddings, AGE Cypher, full provider

Evaluation

The incubator ships a two-axis evaluation surface — system evals (is the harvest correct, durable, fast?) and quality evals (does the graph actually help an LLM answer better than parametric knowledge + live web?). The full system overview — how the harvester builds the KB and how every eval tier fits together — is in docs/harvester-and-eval.md.

| Tier | Where | What it proves |
| --- | --- | --- |
| Scenario (system) | eval/README.md | Cold/incremental harvest, replay determinism, scoped publication, reader fact-pivot |
| Quality (single model) | eval/graph-quality.mjs | Graph arm vs parametric+web baseline, blind-judged against corpus ground truth |
| Quality (cross-model sweep) | eval/sweep/ | A 3×3×3 harvester × query × judge tensor that isolates judge bias |

Cross-model sweep

The sweep answers "is the graph really better, or just a lucky model / biased judge?" by sweeping three independent axes into a score tensor and generating a bias-aware report grounded in a deterministic numeric summary. Latest run (pgsql-hackers-recent, N=8 questions/cell, 216 graded rows):

graph 4.76 vs baseline 2.42 (Δ +2.34 on a 1–5 scale), with the graph winning 163/216 head-to-head comparisons and no judge able to flip the verdict. The result holds even though the graph answers are shorter than the baseline answers. Once baseline web-timeout failures are separated out, the substantive edge deflates to roughly +1.8–2.0 (see the report's methodology + caveats).
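
As a quick sanity check, the headline figures are consistent with each other. The snippet below only re-derives them from the numbers quoted above (3×3×3 cells, 8 questions per cell, the two means, and the win count); the variable names are illustrative, and the win-rate percentage is a derived figure rather than one taken from the report.

```typescript
// Figures quoted in the run summary above.
const axes = [3, 3, 3]; // harvester x query x judge
const questionsPerCell = 8;
const graphMean = 4.76;
const baselineMean = 2.42;
const graphWins = 163;

const cells = axes.reduce((a, b) => a * b, 1); // 27 cells
const gradedRows = cells * questionsPerCell; // 216 graded rows
const delta = (graphMean - baselineMean).toFixed(2); // "2.34"
const winRate = ((graphWins / gradedRows) * 100).toFixed(1); // "75.5"

console.log({ cells, gradedRows, delta, winRate });
```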

Specs & docs

Enhanced facts store (this incubator): SPEC.md (base design) · CRAWLER.md / CRAWLER-SPEC.md (open-graph crawler) · GAP-ANALYSIS.md. Upstream provider contract + tool/test specs: docs/proposals/enhancedfactstore/ (01-functional-spec · 02-api-reference · 03-design · 04-test-spec · 05-tools-spec · 06-provider-test-plan).

Harvester & evaluation: docs/harvester-and-eval.md (full system overview) · eval/README.md (scenario tier) · eval/sweep/REPORT.md (latest cross-model sweep).