@pugi/plugin-multi-corpus-rag
v0.1.0-alpha.2
Pugi multi-corpus RAG plugin - persona-routed Anvil knowledge search injected via experimental.chat.system.transform with LRU+TTL cache and license attribution.
Pugi multi-corpus retrieval-augmented generation plugin. Persona-routed Anvil knowledge search with LRU+TTL cache, license attribution, and four `pugi.rag.*` tools.
Part of the Pugi 1.0 soft fork sprint (ADR-0081).
What it does
- Caches the user's latest message per session via `chat.message` (Day 5 workaround for the missing `userMessageText` on `experimental.chat.system.transform`).
- Resolves the active persona through an injectable callback (the default reads the `PUGI_ACTIVE_PERSONA` env or `<workspaceRoot>/.pugi/active-persona`).
- Fans out to the persona's whitelisted Anvil workspaces in parallel via `POST /v1/knowledge/search`, with a hard fan-out timeout (4s default).
- Renders a system-prompt block with per-chunk clipping, lowest-score eviction under token budget, and an SPDX-grouped license footnote.
- Exposes four tools so the model can introspect the corpus: `pugi.rag.search`, `pugi.rag.workspaces`, `pugi.rag.policy`, `pugi.rag.cache_stats`.
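The rendering step in the list above (per-chunk clipping, then lowest-score eviction under a token budget) could be sketched roughly as follows. This is an illustration only, not the plugin's code: the `Chunk` shape, the four-characters-per-token estimate, the 400-character clip, and the names `clipChunk` and `fitToBudget` are all assumptions made for the example.

```typescript
// Hypothetical chunk shape; the plugin's real types may differ.
interface Chunk {
  workspace: string;
  text: string;
  score: number; // higher means more relevant (could be derived from policy order)
}

// Rough token estimate (assumption: about 4 characters per token).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Clip a single chunk so that no one chunk dominates the block.
function clipChunk(chunk: Chunk, maxChars: number): Chunk {
  if (chunk.text.length <= maxChars) return chunk;
  return { ...chunk, text: chunk.text.slice(0, maxChars) + "..." };
}

// Clip every chunk, then drop the lowest-scoring chunks until the total fits.
function fitToBudget(chunks: Chunk[], tokenBudget: number, maxChars = 400): Chunk[] {
  const kept = chunks
    .map((c) => clipChunk(c, maxChars))
    .sort((a, b) => b.score - a.score); // best first, so the worst sits at the end
  let total = kept.reduce((sum, c) => sum + estimateTokens(c.text), 0);
  while (kept.length > 0 && total > tokenBudget) {
    const evicted = kept.pop() as Chunk;
    total -= estimateTokens(evicted.text);
  }
  return kept;
}
```

With three 40-character chunks and a budget of 20 estimated tokens, the sketch keeps the two highest-scoring chunks and evicts the third.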
Anvil contract gap
Anvil's production endpoint POST /v1/knowledge/search is single-workspace
per call. The workspace id is derived from SHA-256(bearer) to defend the
cross-tenant boundary (see apps/gateway/src/controllers/knowledge.controller.ts
P0-A note in the ai-engine repo). To honour the multi-corpus persona policy
without an Anvil change, this plugin:
- Accepts `workspaceTokens: Record<string, string>`, mapping workspace slug to bearer, and fans out N parallel calls, one per token.
- Falls back to a single call against `apiKey` if no per-workspace tokens are set, and logs a one-time warning.
- Compensates for the (current) absence of per-chunk scores by deriving priority from the persona policy order: the earlier a workspace appears, the higher its priority, and the longer its chunks are retained during token-budget eviction.
A follow-up Anvil PR can add POST /v1/knowledge/search-multi that accepts a
single bearer and a workspace allowlist, at which point this plugin can
collapse to one HTTP call per turn.
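The token handling described in this section can be sketched as a small planning step that runs before any HTTP call is made. Everything here is hypothetical: `PlannedCall`, `planCalls`, and the placeholder workspace label are names invented for the example, and the choice to skip workspaces that lack a token whenever at least one token is present is an assumption rather than documented behaviour.

```typescript
// Hypothetical description of one planned search call.
interface PlannedCall {
  workspace: string;
  bearer: string;
  priority: number; // 0 is the highest priority (first in the persona policy)
}

function planCalls(
  policyOrder: string[],
  workspaceTokens: Record<string, string | undefined>,
  apiKey?: string,
): PlannedCall[] {
  const calls: PlannedCall[] = [];
  policyOrder.forEach((workspace, index) => {
    const bearer = workspaceTokens[workspace];
    // One call per workspace that has its own bearer; priority follows policy order.
    if (bearer) calls.push({ workspace, bearer, priority: index });
  });
  if (calls.length === 0 && apiKey) {
    // Single-call fallback: Anvil derives the workspace from the bearer itself,
    // so the workspace label here is only a placeholder.
    calls.push({ workspace: "(derived from apiKey)", bearer: apiKey, priority: 0 });
  }
  return calls;
}
```

Each planned call would then become one `POST /v1/knowledge/search` request, with `priority` standing in for the missing per-chunk score during eviction.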
Install
    pnpm add @pugi/plugin-multi-corpus-rag

Usage

    // pugi.config.ts
    export default {
      plugin: [
        [
          '@pugi/plugin-multi-corpus-rag',
          {
            anvilBase: 'https://anvil.pugi.io/v1',
            apiKey: process.env.PUGI_API_KEY,
            workspaceTokens: {
              web_platform: process.env.PUGI_TOKEN_WEB_PLATFORM,
              ui_registry: process.env.PUGI_TOKEN_UI_REGISTRY,
              a11y_baseline: process.env.PUGI_TOKEN_A11Y_BASELINE,
            },
            tokenBudget: 2000,
            cacheTtlMs: 600_000,
          },
        ],
      ],
    };

Options
| Option | Default | Description |
| ----------------------- | -------------------------------- | -------------------------------------------------------------------------------------------- |
| anvilBase | https://anvil.pugi.io/v1 | Anvil HTTP base URL. |
| apiKey | process.env.PUGI_API_KEY | Default bearer token used when a workspace is not present in workspaceTokens. |
| workspaceTokens | {} | Per-workspace bearers used to fan out search calls. |
| embeddingModel | bge-m3 | Embedding model id passed to Anvil when applicable. |
| defaultTopK | 5 | Default chunks-per-workspace ceiling. |
| workspacePolicy | persona-routed | Routing mode: persona-routed, all-allowed, or custom. |
| personaWorkspaceMap | built-in 16-persona table | Overrides the map from persona slug to workspaces. |
| personaResolver | env + .pugi/active-persona | Callback to resolve the active persona by sessionID. |
| tokenBudget | 2000 | Approximate token budget for the injected system-prompt block. |
| cacheTtlMs | 600_000 (10 min) | LRU cache TTL. |
| cacheMaxEntries | 1000 | LRU cache max entries. |
| maxRetries | 3 | Retries on transient 5xx and network errors. |
| retryBaseMs | 250 | Exponential backoff base in ms. |
| licenseOverrides | {} | Override the SPDX license map per workspace slug. |
| warnOnMissingAuth | true | When true, log a one-time warning if no auth is configured. |
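
To make the `maxRetries` and `retryBaseMs` options concrete, here is a minimal sketch of exponential backoff on transient 5xx responses. It assumes plain doubling with no jitter, which the README does not confirm, and it only inspects status codes; retrying thrown network errors would need a `try`/`catch` around the attempt. The names `backoffDelays`, `isTransientStatus`, and `withRetry` are hypothetical.

```typescript
// Delay before retry n (0-based): baseMs * 2^n. Assumption: plain doubling, no jitter.
function backoffDelays(maxRetries: number, baseMs: number): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => baseMs * 2 ** attempt);
}

// Treat 5xx responses as transient; anything else is returned to the caller as-is.
function isTransientStatus(status: number): boolean {
  return status >= 500 && status <= 599;
}

// Call `attempt` until it yields a non-transient status or the retries run out.
// `sleep` is injectable so the loop can be exercised without real timers.
async function withRetry(
  attempt: () => Promise<number>,
  maxRetries = 3,
  baseMs = 250,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<number> {
  let status = await attempt();
  for (const delay of backoffDelays(maxRetries, baseMs)) {
    if (!isTransientStatus(status)) break;
    await sleep(delay);
    status = await attempt();
  }
  return status;
}
```

With the defaults from the table, this schedule waits 250 ms, 500 ms, and 1000 ms before the first, second, and third retry.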
Persona to workspace matrix
| Persona slug | Workspaces |
| ------------- | -------------------------------------------------------------------------------- |
| pugi | web_platform, ui_registry, a11y_baseline |
| hiroshi | framework_nestjs, framework_nextjs, framework_prisma, web_platform |
| marcus | architect-playbook, engineering-standards, ai_sdk_cookbook, db_vector |
| olivia | landing_templates, design_radix, ui_registry |
| priya | security_baseline, a11y_baseline |
| omar | engineering-standards, web_platform |
| diego | devops-handbook, framework_mastra |
| sigma | security_baseline, engineering-standards |
| sofia | mira_knowledge, bas-business-patterns-2026-05-23 |
| yuki | ui_registry, design_radix, a11y_baseline |
| daniel | engineering-standards, devops-handbook |
| tom | framework_nextjs, ui_registry, design_radix, web_platform |
| anika | engineering-standards, security_baseline |
| hannah | bas-business-patterns-2026-05-23 |
| mateo | mira_knowledge, bas-business-patterns-2026-05-23 |
| lena | bas-business-patterns-2026-05-23 |
Note: mira_knowledge is the engineering slug for the internal Pugi seed
corpus. Customer-facing brand rewriting is owned by @pugi/plugin-voice-rules;
this plugin does not import any other @pugi/plugin-* (ADR-0081).
License attribution
Each workspace ships with an SPDX identifier. The system-prompt footnote groups workspaces by SPDX id and only renders for externally licensed sources (proprietary slugs are omitted unless they are the only sources):

    Sources retrieved via Anvil RAG. License attributions: MIT (framework_nestjs, framework_nextjs); Apache-2.0 (framework_prisma).

| Workspace                            | SPDX                  |
| ------------------------------------ | --------------------- |
| web_platform | CC-BY-SA-2.5 |
| security_baseline | CC-BY-SA-4.0 |
| ui_registry | MIT |
| a11y_baseline | W3C-Document-1.0 |
| framework_nestjs | MIT |
| framework_nextjs | MIT |
| framework_mastra | Apache-2.0 |
| framework_prisma | Apache-2.0 |
| framework_grammy | MIT |
| ai_sdk_cookbook | MIT |
| db_vector | PostgreSQL |
| landing_templates | MIT |
| design_radix | MIT |
| marketing_frameworks | OGL-UK-2.0 |
| seo_geo | Apache-2.0 |
| mira_knowledge | PROPRIETARY |
| bas-business-patterns-2026-05-23 | PROPRIETARY |
| architect-playbook | PROPRIETARY |
| engineering-standards | PROPRIETARY |
| devops-handbook | PROPRIETARY |
Override per workspace via licenseOverrides.
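The grouping rule for the footnote can be sketched as below. This is a guess at the logic from the description above, not the plugin's implementation: `SPDX_BY_WORKSPACE` holds only a subset of the table, and `renderLicenseFootnote` and the `UNKNOWN` fallback are hypothetical.

```typescript
// Subset of the table above; the lookup shape is an assumption for this sketch.
const SPDX_BY_WORKSPACE: Record<string, string> = {
  framework_nestjs: "MIT",
  framework_nextjs: "MIT",
  framework_prisma: "Apache-2.0",
  mira_knowledge: "PROPRIETARY",
};

function renderLicenseFootnote(
  workspaces: string[],
  overrides: Record<string, string> = {},
): string {
  // Overrides win over the built-in map; "UNKNOWN" is a placeholder assumption.
  const licenseOf = (ws: string): string =>
    overrides[ws] ?? SPDX_BY_WORKSPACE[ws] ?? "UNKNOWN";

  const external = workspaces.filter((ws) => licenseOf(ws) !== "PROPRIETARY");
  // Proprietary slugs appear only when nothing else was retrieved.
  const shown = external.length > 0 ? external : workspaces;
  if (shown.length === 0) return "";

  // Map preserves insertion order, so groups appear in first-seen order.
  const groups = new Map<string, string[]>();
  shown.forEach((ws) => {
    const spdx = licenseOf(ws);
    groups.set(spdx, [...(groups.get(spdx) ?? []), ws]);
  });
  const parts = Array.from(groups, ([spdx, slugs]) => `${spdx} (${slugs.join(", ")})`);
  return `Sources retrieved via Anvil RAG. License attributions: ${parts.join("; ")}.`;
}
```

Passing the three framework workspaces plus `mira_knowledge` reproduces the example footnote shown earlier in this section, with the proprietary slug left out.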
Tools
| Tool | Args | Returns |
| ------------------------ | --------------------------------------------- | ---------------------------------------------------------------------- |
| pugi.rag.search | { query, workspaces?, topK?, persona? } | Fan-out search result with cacheHit flag and per-workspace failures. |
| pugi.rag.workspaces | {} | Workspace metadata list (id, license, tier, description). |
| pugi.rag.policy | { persona? } | Persona to workspaces whitelist. |
| pugi.rag.cache_stats | {} | LRU snapshot (size, hits, misses, hitRate, ttlMs, maxEntries). |
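
The snapshot returned by `pugi.rag.cache_stats` suggests an LRU cache with per-entry expiry and hit/miss counters. A minimal sketch of such a cache follows; the class name `LruTtlCache`, its constructor arguments, and the injectable clock are assumptions for illustration, and the plugin's real cache may be built differently.

```typescript
interface CacheStats {
  size: number;
  hits: number;
  misses: number;
  hitRate: number;
  ttlMs: number;
  maxEntries: number;
}

class LruTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private hits = 0;
  private misses = 0;
  private ttlMs: number;
  private maxEntries: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs = 600_000, maxEntries = 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= this.now()) {
      if (entry) this.entries.delete(key); // drop the expired entry
      this.misses++;
      return undefined;
    }
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    this.hits++;
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  stats(): CacheStats {
    const lookups = this.hits + this.misses;
    return {
      size: this.entries.size,
      hits: this.hits,
      misses: this.misses,
      hitRate: lookups === 0 ? 0 : this.hits / lookups,
      ttlMs: this.ttlMs,
      maxEntries: this.maxEntries,
    };
  }
}
```

Relying on `Map` insertion order keeps the sketch short; a production cache would more likely use a dedicated LRU library or a linked list.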
Hook surface
- `chat.message` (Day 5 workaround) caches the last user turn per session.
- `experimental.chat.system.transform` pulls the cached turn, resolves the active persona, fans out to that persona's workspaces, formats the result, and pushes the block into `output.system`. The fan-out is wrapped in an `AbortController`-style deadline so a slow Anvil cannot stall the chat.
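One way to build the deadline mentioned above is to pair a timer with an `AbortController` and resolve with a fallback value when the timer fires first. The helper below is a sketch under that assumption; `withDeadline` is a hypothetical name, and the plugin may handle timeouts differently (for example by rejecting instead of falling back).

```typescript
// Run `task` with an AbortSignal that fires after `deadlineMs`.
// If the deadline wins, resolve with `fallback` so the chat turn can proceed
// without RAG context instead of failing.
function withDeadline<T>(
  task: (signal: AbortSignal) => Promise<T>,
  deadlineMs: number,
  fallback: T,
): Promise<T> {
  const controller = new AbortController();
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      controller.abort(); // lets the task cancel in-flight requests
      resolve(fallback);
    }, deadlineMs);
    task(controller.signal)
      .then(resolve, reject) // a no-op if the deadline already resolved
      .finally(() => clearTimeout(timer));
  });
}
```

A fan-out would pass `signal` on to each HTTP request, so that requests still in flight are cancelled once the 4s default elapses.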
Disk policy override
Place a JSON file at `<workspaceRoot>/.pugi/rag-policy.json` (or `~/.pugi/rag-policy.json` as a fallback) to override the built-in policy:

    {
      "version": 1,
      "personas": {
        "hiroshi": ["framework_nestjs", "custom_workspace"]
      },
      "licenses": {
        "custom_workspace": "BSD-3-Clause"
      }
    }

Programmatic `personaWorkspaceMap` and `licenseOverrides` options win over the on-disk file; the disk file wins over the built-in defaults.
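The precedence order (programmatic options, then the disk file, then built-in defaults) can be illustrated with a layered merge. This sketch assumes the layers merge key by key, so that an override for one persona replaces that persona's whole workspace list and leaves other personas untouched; the README does not say whether the real merge works this way. `PolicyLayer` and `resolvePolicy` are hypothetical names.

```typescript
type PersonaMap = Record<string, string[]>;
type LicenseMap = Record<string, string>;

// One source of policy: built-in defaults, the disk file, or plugin options.
interface PolicyLayer {
  personas?: PersonaMap;
  licenses?: LicenseMap;
}

// Later spreads win, so the order encodes the precedence:
// built-in < disk file < programmatic options.
function resolvePolicy(
  builtIn: PolicyLayer,
  disk: PolicyLayer,
  programmatic: PolicyLayer,
): { personas: PersonaMap; licenses: LicenseMap } {
  return {
    personas: { ...builtIn.personas, ...disk.personas, ...programmatic.personas },
    licenses: { ...builtIn.licenses, ...disk.licenses, ...programmatic.licenses },
  };
}
```

Under this reading, the example file above changes only `hiroshi` and adds a license for `custom_workspace`, while the other personas keep their built-in workspaces.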
Hybrid retrieval roadmap
- Phase 1 (shipped): persona-routed fan-out across single-workspace Anvil calls.
- Phase 2 (deferred): hybrid BM25 + embedding rerank via local FTS5 and Anvil.
- Phase 3 (deferred): graph-augmented retrieval that cross-references `@pugi/plugin-codegraph` symbols.
License
MIT. See LICENSE.
