@nusoft/nuvector

v0.1.5

Published

2 days ago

Local-first memory and retrieval engine of NuOS — vector storage, four-layer hierarchical retrieval, provenance, temporal reasoning, write-intent support

The memory and retrieval engine of NuOS: a four-layer hierarchical knowledge index designed to make a million-article wiki searchable in milliseconds, with results that fit in a few hundred tokens.

NuVector is not a generic vector database. It is the retrieval substrate of an operational neural OS: built specifically to be the search layer over a compiled knowledge wiki and a workflow engine's context provider. It assumes structured content (articles, sections, citations, backlinks) and optimises for that shape.

import { NuVector } from "@nusoft/nuvector";

const memory = await NuVector.open({
  storage: "./project.nv",
  dimensions: 768,
  tenant: "school_bridge",
});

await memory.upsert({
  id: "art_001",
  kind: "nuwiki_article_summary",
  embedding, // caller-supplied vector; must match the `dimensions` set in open()
  text: "Year 5 Oak — morning routine and pastoral notes",
  tenant: "school_bridge",
  metadata: { articleId: "art_001", documentType: "class_profile", version: "v1" },
});

const context = await memory.searchKnowledge({
  query: "morning briefing for Year 5 Oak",
  embedding: queryEmbedding, // caller-supplied query vector from the same embedding model
  budget: { maxTokens: 2000, maxArticles: 3 },
});
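The budget in the searchKnowledge call above caps how much context comes back. How NuVector enforces it is not described in this README; one plausible reading is a greedy pack of the highest-scoring articles. The sketch below is self-contained and makes several assumptions: the RankedArticle and RetrievalBudget shapes, the packWithinBudget helper, and the four-characters-per-token estimate are all hypothetical.

```typescript
// Hypothetical shapes for illustration; not taken from the NuVector contract.
interface RankedArticle {
  articleId: string;
  text: string;
  score: number; // higher means more relevant
}

interface RetrievalBudget {
  maxTokens: number;
  maxArticles: number;
}

// Assumption: roughly four characters per token. A real tokenizer will differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily keep the highest-scoring articles that still fit within both limits.
function packWithinBudget(
  hits: RankedArticle[],
  budget: RetrievalBudget,
): RankedArticle[] {
  const ranked = [...hits].sort((a, b) => b.score - a.score);
  const kept: RankedArticle[] = [];
  let usedTokens = 0;
  for (const hit of ranked) {
    if (kept.length >= budget.maxArticles) break;
    const cost = estimateTokens(hit.text);
    if (usedTokens + cost > budget.maxTokens) continue; // a smaller article may still fit
    kept.push(hit);
    usedTokens += cost;
  }
  return kept;
}
```

Skipping an oversized article rather than stopping at it lets a shorter, lower-ranked article use the remaining tokens; a stricter policy could stop at the first miss instead.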

Picking the right retrieval method. searchKnowledge is the NuWiki-shaped four-layer entry point — it always anchors on nuwiki_article_summary (Layer 1) records. If your records are NuWiki articles (with sections, citations, and a graph), this is the right call. If your records are arbitrary chunks (e.g. document_chunk, incident_history, custom kinds), use retrieveContext instead — it respects whatever kind filter you pass and will not filter you down to Layer 1 by surprise.
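The difference between the two entry points can be modelled with a toy in-memory store. This is a sketch of the behaviour described above, not the NuVector API: StoredRecord, genericSearch, and knowledgeSearch are hypothetical names, and the score field stands in for real vector similarity.

```typescript
// Toy in-memory model; names and shapes are hypothetical, not the NuVector API.
interface StoredRecord {
  id: string;
  kind: string;
  score: number; // stand-in for vector similarity to the query
}

// Generic search: applies exactly the kind filter the caller passes (or none).
function genericSearch(records: StoredRecord[], kinds?: string[]): StoredRecord[] {
  const allowed = kinds ? new Set(kinds) : null;
  return records
    .filter((r) => allowed === null || allowed.has(r.kind))
    .sort((a, b) => b.score - a.score);
}

// Knowledge-shaped search: always anchors on Layer 1 article summaries,
// so records of any other kind never appear, however well they match.
function knowledgeSearch(records: StoredRecord[]): StoredRecord[] {
  return genericSearch(records, ["nuwiki_article_summary"]);
}
```

In this model a store holding only document_chunk records returns nothing from knowledgeSearch, which is the surprise the paragraph above warns about.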

Install

npm install @nusoft/nuvector

The package is published privately under the @nusoft scope.

The four layers

NuVector maintains four parallel indexes per deployment, each tuned for a different retrieval pattern:

| Layer | Records | Latency target | Use case |
|---|---|---|---|
| 1. Article summaries | ~1 per article | <10ms | Entry point; ~70% of queries terminate here |
| 2. Sections | ~5 per article (with article-summary prefix) | <30ms (<5ms filtered) | Drill-down after a layer-1 hit |
| 3. Citations | per-citation, opt-in | <5ms filtered | Precision retrieval for evidence-critical patterns |
| 4. Backlink graph | typed edges between articles | <10ms 2-hop | Relationship traversal |
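The Layer 1 to Layer 2 drill-down can be sketched as a two-stage search: rank article summaries first, then rank only the sections that belong to the winning articles. The code below is a minimal illustration over plain arrays, assuming cosine similarity; SummaryEntry, SectionEntry, and drillDown are hypothetical names, and the real indexes are certainly not linear scans.

```typescript
// Hypothetical record shapes for a two-stage (summary, then section) search.
interface SummaryEntry {
  articleId: string;
  vector: number[];
}

interface SectionEntry {
  articleId: string;
  sectionId: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the k items with the highest score, best first.
function topBy<T>(items: T[], score: (item: T) => number, k: number): T[] {
  return [...items].sort((p, q) => score(q) - score(p)).slice(0, k);
}

// Stage 1 picks articles by summary; stage 2 searches sections of those articles only.
function drillDown(
  query: number[],
  summaries: SummaryEntry[],
  sections: SectionEntry[],
  topArticles: number,
  topSections: number,
): SectionEntry[] {
  const best = topBy(summaries, (s) => cosineSimilarity(query, s.vector), topArticles);
  const articleIds = new Set(best.map((s) => s.articleId));
  const candidates = sections.filter((s) => articleIds.has(s.articleId));
  return topBy(candidates, (s) => cosineSimilarity(query, s.vector), topSections);
}
```

A section from an article that lost at stage 1 is never scored, which is what keeps the second stage cheap.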

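A query for "context about James Smith's recent peer-conflict patterns" with subject = pupil_james_smith AND documentType = peer_conflict_pattern cuts a 50,000-article corpus to 5–10 candidates before vector math runs. Filter pushdown does most of the work: metadata filters narrow the candidate set first, so similarity is computed over a handful of vectors rather than the whole corpus.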
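A filter-first search can be sketched in a few lines. This is an illustration of the idea only, assuming exact-match metadata filters and dot-product scoring; IndexedRecord and filteredSearch are hypothetical names, and the compared counter exists purely to show how few vectors are touched.

```typescript
// Hypothetical record shape: flat string metadata plus an embedding.
interface IndexedRecord {
  id: string;
  metadata: { [key: string]: string };
  vector: number[];
}

function dotProduct(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += (a[i] ?? 0) * (b[i] ?? 0);
  }
  return sum;
}

// Apply every metadata condition first, then score only the survivors.
function filteredSearch(
  records: IndexedRecord[],
  where: { [key: string]: string },
  query: number[],
  k: number,
): { ids: string[]; compared: number } {
  const candidates = records.filter((r) =>
    Object.keys(where).every((key) => r.metadata[key] === where[key]),
  );
  const ranked = candidates
    .map((r) => ({ id: r.id, score: dotProduct(query, r.vector) }))
    .sort((p, q) => q.score - p.score)
    .slice(0, k);
  return { ids: ranked.map((r) => r.id), compared: candidates.length };
}
```

Calling filteredSearch with where set to { subject: "pupil_james_smith", documentType: "peer_conflict_pattern" } scores only the records that carry both values.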

Storage backends

Three backends, one interface:

// In-memory (tests, ephemeral)
await NuVector.open({ storage: "memory:", dimensions: 768 });

// Local file (single-process production)
await NuVector.open({ storage: "./project.nv", dimensions: 768 });

// Postgres (cloud, multi-process)
await NuVector.open({
  storage: { kind: "postgres", connectionString: process.env.DATABASE_URL! },
  dimensions: 768,
});

The Postgres backend uses pgvector for similarity. Set up the schema once before opening:

import { NuVectorPostgres } from "@nusoft/nuvector/postgres";

await NuVectorPostgres.install({
  connectionString: process.env.DATABASE_URL!,
  dimensions: 768,
});

Postgres consumers must npm install pg themselves — the dependency is loaded dynamically so non-postgres users pay zero cost.

Public API

All methods on NuVector:

| Method | Purpose |
|---|---|
| open(config) | Open or create a store |
| upsert(record) | Insert/update one record (idempotent on id) |
| upsertBatch(records) | Atomic bulk insert; used by NuWiki on every article publish |
| retrieveContext(query) | Generic vector search with full filter control. Use this for arbitrary record kinds (e.g. document_chunk, custom kinds); it respects whatever kind filter you pass. |
| searchKnowledge(request) | NuWiki-shaped four-layer retrieval (Layer 1 article summaries → Layer 2 section drill-down → Layer 4 graph). Only returns nuwiki_article_summary records; do not call this if your records are not NuWiki articles. Use retrieveContext instead. |
| searchArticles(request) | Layer 1 only: article summaries |
| searchSectionsInArticles(request) | Layer 2, filtered to specific article IDs |
| searchCitations(request) | Layer 3 precision search |
| traverseFromArticle(request) | Layer 4 BFS graph traversal |
| remember(record) | Write a ProvenanceRecord |
| fetch(ids) | Direct fetch by id |
| delete(query) | GDPR right-to-erasure (idempotent) |
| snapshot(path) | Backup |
| restore(path) | Restore from snapshot |
| subscribeToInvalidations(handler) | Real-time invalidation events |
| close() | Flush and close cleanly |
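The Layer 4 traversal is described as a BFS over typed edges with a hop limit. The request shape for traverseFromArticle is not shown in this README, so the sketch below is a standalone breadth-first search rather than a call into the package; TypedEdge and traverseWithinHops are hypothetical names, and edges are assumed to be directed.

```typescript
// Hypothetical edge shape: a directed, typed link between two article IDs.
interface TypedEdge {
  from: string;
  to: string;
  type: string;
}

// Breadth-first search from `start`, returning each reachable article with its hop count.
// Optionally restricts traversal to the given edge types.
function traverseWithinHops(
  edges: TypedEdge[],
  start: string,
  maxHops: number,
  edgeTypes?: string[],
): Map<string, number> {
  const allowed = edgeTypes ? new Set(edgeTypes) : null;
  const adjacency = new Map<string, string[]>();
  for (const edge of edges) {
    if (allowed !== null && !allowed.has(edge.type)) continue;
    const neighbours = adjacency.get(edge.from) ?? [];
    neighbours.push(edge.to);
    adjacency.set(edge.from, neighbours);
  }

  const hops = new Map<string, number>([[start, 0]]);
  let frontier: string[] = [start];
  for (let depth = 1; depth <= maxHops && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbour of adjacency.get(node) ?? []) {
        if (hops.has(neighbour)) continue; // already reached by a shorter or equal path
        hops.set(neighbour, depth);
        next.push(neighbour);
      }
    }
    frontier = next;
  }
  return hops;
}
```

Because each article is recorded the first time it is reached, cycles in the backlink graph cannot cause repeated visits.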

Subpath imports

import { NuVectorGraph } from "@nusoft/nuvector/graph";       // Layer 4 graph operations
import { NuVectorPostgres } from "@nusoft/nuvector/postgres"; // Postgres install/status helpers
import { NuVectorGNN } from "@nusoft/nuvector/gnn";           // Optional GNN reranker
import { NuVectorLite } from "@nusoft/nuvector/lite";         // Pure-JS fallback (stub at v0.1.0)

What NuVector is not

Not the source of truth for any operational record. The application database owns attendance, incidents, care plans, communications. NuVector only retrieves.
Not an embedding service. Consumers supply embeddings — model choice is the consumer's GDPR/data-residency call.
Not an LLM host.
Not flat retrieval. A naïve embed-everything-and-cosine architecture fails at fleet scale.
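Since consumers supply their own embeddings, a pre-flight check before calling upsert can catch a model or dimension mismatch early. The helper below is a hypothetical consumer-side sketch, not part of the package, and assumes embeddings are plain number arrays; what NuVector itself does with a malformed vector is not stated here.

```typescript
// Hypothetical consumer-side check; returns an error message, or null when the vector is usable.
function checkEmbedding(vector: number[], dimensions: number): string | null {
  if (vector.length !== dimensions) {
    return `expected ${dimensions} dimensions, got ${vector.length}`;
  }
  if (!vector.every((value) => Number.isFinite(value))) {
    return "embedding contains NaN or infinite values";
  }
  return null;
}
```

With the store opened at dimensions: 768, checkEmbedding(embedding, 768) would be run on each vector before it is written.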

Education / MIS context

NuVector was designed alongside Sensight's needs. For SEN and mainstream MIS workflows it supports:

resolving references to pupils, classes, staff, interventions, attendance context
retrieving supporting evidence from plans, notes, reviews, interventions
producing context packs for AI assistants without leaking PII into the prompt
learning from corrections without becoming the statutory record

The architectural rule: NuVector never holds operational truth. The MIS relational or event database remains the source of truth.

Documentation

The full contract: nuos/docs/contracts/nuvector.md
The build catalogue: nuos/docs/build/

License

MIT
