@vantageos/data-lake
v0.3.1
Convex Component: RAG + embeddings + generic intake (memories, episodes, search). Substrate data lake for VantagePeers, Vantage Immo, and downstream BUs.
Convex Component: RAG + embeddings + generic intake substrate for VantagePeers, Vantage Radar, Vantage Immo, and downstream BUs.
Install
pnpm add @vantageos/data-lake
Mount
In your consumer's convex/convex.config.ts:
import { defineApp } from "convex/server";
import dataLake from "@vantageos/data-lake/convex.config.js";
const app = defineApp();
app.use(dataLake, { name: "dataLake" }); // VantagePeers
// or, in another deployment:
// app.use(dataLake, { name: "radarDataLake" }); // Vantage Radar (isolated namespace)
export default app;
The same Component can be mounted multiple times under distinct names in different deployments. Each mount gets its own isolated tables and RAG namespace.
Public API
Write path — memoriesV1.storeMemory + episodesV1.storeEpisode (v0.3.0+, BREAKING)
Both mutations REQUIRE embedding: number[] at runtime. The validator types
it as optional for forward compatibility, but the handler throws a clear
error if omitted (loud failure > silent inconsistency — Day 79 doctrine).
Indexing happens inline in RAG using the chunks form: the Component never
computes embeddings itself, consistent with ADR Phase E.0 (no "use node" in Components).
// Host computes embedding via its own aiClient (Node-side OK in host)
const embedding = await aiClient.embed(content);
await ctx.runMutation(
  components.<mountName>.component.memoriesV1.storeMemory,
  { content, namespace, type, createdBy, embedding },
);
Calling storeMemory / storeEpisode without embedding throws:
embedding required — host must compute via aiClient.embed before storeMemory call. See ADR Phase E.0 + README contract.
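The "optional in the validator, required in the handler" contract described above can be illustrated with a small guard. This is only a sketch, not the Component's actual handler: the `StoreMemoryArgs` interface and the `requireEmbedding` helper are hypothetical names, and the error text is abbreviated.

```typescript
// Hypothetical argument shape. Only the field names come from the README example.
interface StoreMemoryArgs {
  content: string;
  namespace: string;
  type: string;
  createdBy: string;
  embedding?: number[]; // typed as optional, but enforced at runtime
}

// Hypothetical guard: fail loudly when the host forgot to compute the embedding.
function requireEmbedding(args: StoreMemoryArgs): number[] {
  if (args.embedding === undefined) {
    // Abbreviated message; the real handler's wording is quoted above.
    throw new Error("embedding required: host must compute it before calling storeMemory");
  }
  return args.embedding;
}
```

Keeping the field optional in the type while rejecting its absence at runtime means older call sites still type-check, but they fail on the first call instead of writing rows that can never be indexed.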
Supersede / softDelete / TTL-expire paths re-use the stored embedding
field on the memory row (added to the Component schema in v0.2.3) — no
host recompute needed for those operations.
hybridSearchV1 (v0.2.2+)
Access path: components.<mountName>.component.<file>.<export> — the
.component. intermediate is required because convex.config.ts lives at
the package root while function modules live in component/*.ts. Convex
codegen nests modules under a component key in this layout. Example:
components.radarDataLake.component.searchV1.hybridSearchV1.
(For comparison, @convex-dev/rag has no .component. wrapper because it
colocates its convex.config.ts in the same directory as its function files,
so codegen flattens the path. Different layout, different path.)
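To make the path shape concrete, here is a sketch that models the nesting with a plain object. The `components` object below is a hypothetical stand-in for the generated one, and `FunctionRef` and `resolveRef` are illustrative names, not part of Convex or of this package.

```typescript
// Hypothetical stand-ins for the generated types.
type FunctionRef = { path: string };
type ComponentApi = Record<string, Record<string, FunctionRef>>;
type Components = Record<string, { component: ComponentApi }>;

// Mock of the generated object for a host that mounted the Component as "radarDataLake".
const components: Components = {
  radarDataLake: {
    component: {
      searchV1: { hybridSearchV1: { path: "searchV1:hybridSearchV1" } },
    },
  },
};

// Walks components.<mountName>.component.<file>.<export> and fails loudly on a bad path.
function resolveRef(mountName: string, file: string, exportName: string): FunctionRef {
  const ref = components[mountName]?.component[file]?.[exportName];
  if (!ref) {
    throw new Error(`no function at components.${mountName}.component.${file}.${exportName}`);
  }
  return ref;
}
```

In a real host you would write the path statically (for example `components.radarDataLake.component.searchV1.hybridSearchV1`) so that codegen types apply; the dynamic lookup here only illustrates where the `.component.` segment sits.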
Hybrid search (vector + BM25 RRF fusion) over the data lake namespace.
const res = await ctx.runAction(
  components.dataLake.component.searchV1.hybridSearchV1,
  {
    query: "the search text",
    queryEmbedding: await aiClient.embed("the search text"), // host-computed
    namespace: "global",
    type: "user", // optional, generic string
    onlyLatest: true, // default true
    limit: 10,
    vectorWeight: 1,
    textWeight: 1,
    vectorScoreThreshold: 0.15,
    minRrfRank: undefined,
    extraFilters: [{ name: "bu", value: "vantage-immo" }], // optional
  },
);
// → { results: [{id, rrfScore, content, metadata}], totalMatches, latencyMs }
queryEmbedding is required. The Component cannot compute embeddings
itself (Convex CLI 1.39.1 forbids "use node" in Components — ADR Phase
E.0). Compute it host-side via your own aiClient.ts action and pass the
Array<number> here.
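For readers unfamiliar with the "vector + BM25 RRF fusion" mentioned above, the following sketch shows weighted reciprocal rank fusion in general terms. It is not the Component's implementation: the `rrfFuse` function, its option names beyond `vectorWeight`, `textWeight`, and `limit`, and the constant `k = 60` (a common default for RRF) are all assumptions made for illustration.

```typescript
interface FusedHit {
  id: string;
  rrfScore: number;
}

// Weighted reciprocal rank fusion over two ranked id lists (best hit first).
// Each list contributes weight / (k + rank) per id, with rank starting at 1.
function rrfFuse(
  vectorHits: string[],
  textHits: string[],
  opts: { vectorWeight?: number; textWeight?: number; k?: number; limit?: number } = {},
): FusedHit[] {
  const { vectorWeight = 1, textWeight = 1, k = 60, limit = 10 } = opts; // k = 60 is an assumption
  const scores = new Map<string, number>();

  const accumulate = (ids: string[], weight: number): void => {
    ids.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  };

  accumulate(vectorHits, vectorWeight);
  accumulate(textHits, textWeight);

  return Array.from(scores.entries())
    .map(([id, rrfScore]) => ({ id, rrfScore }))
    .sort((a, b) => b.rrfScore - a.rrfScore)
    .slice(0, limit);
}
```

Under these assumptions, an id that appears in both lists accumulates two contributions and tends to outrank ids found by only one retriever, and setting `textWeight` to 0 reduces the fusion to the vector ranking alone.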
Other actions
| Action | Purpose |
|---|---|
| memoriesV1.store | Insert / supersede a memory with optional embedding |
| memoriesV1.validateIds | Validate cross-Component memory id references |
| episodesV1.storeEpisode | Store an episode tied to memories |
| searchV1.recall | Pure vector search (semantic) |
| searchV1.textSearch | Pure BM25 text search |
| searchV1.hybridSearch | Legacy hybrid action (v0.1.0 signature, kept for VP host) |
| searchV1.searchFixPatterns | Hydrated vector search over the fixpatterns namespace |
References
- API surface conventions: decisions/c1-namespacing-convention-2026-05-21.md
- Public APIs design: decisions/c1-public-apis-design-2026-05-21.md
- Embedding host-responsibility ADR: decisions/c1-phase-e0-component-api-internal-change-2026-05-21.md
- hybridSearchV1 contract: co-designed with xi (msgjn77yys9t4vjxcqc9v8x2j8xqd879bx9), Day 79 round 2 review of Vantage Radar v2.
License
FSL-1.1-Apache-2.0 — see LICENSE at repo root.
