@onmars/lunar-kb
v0.15.1 · 1,639 downloads
@onmars/lunar-kb
Citation-grounded knowledge base skill for Lunar.
Status: scaffolding only — Phase 0. No functional code yet. This package lays out the directory structure, types, and DDL that Phase 1 will implement.
Purpose
A per-moon RAG system where every answer must cite the source chunk it came from. Designed so Artemis (veterinary) and Deimos (research) can answer with traceable provenance — and refuse to answer when there's no evidence.
Canonical spec lives in Craft:
Lunar KB — Knowledge Base with mandatory citations (block id 8321e434-55e8-76ae-23c1-6fc6a4ce7120). Read it before implementing anything.
Design in one paragraph
Opt-in skill activated per-moon in config.yaml under skills.kb. Data lives
in Postgres + pgvector, isolated per moon by schema-per-moon (artemis_kb,
deimos_kb, …) inside a user-configurable DB. Pipeline (planned): LlamaIndex.TS
orchestrates Docling (Python sidecar) for PDF parsing → chunker with Anthropic
Contextual Retrieval → embeddings → pgvector index. Query path: hybrid search
→ Cohere Rerank 3.5 (or LLM-as-judge fallback) → answer generation with
mandatory citations → grounding check.
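The refuse-without-evidence contract described above can be sketched in TypeScript. This is a sketch only: beyond the `KBProvider(ingest, query, destroy)` surface named in the Phase 1 TODO below, every type and signature here is an assumption, not the shipped API.

```typescript
// Hypothetical types — field names mirror the README's dual citation format.
interface Citation {
  human: string;      // e.g. "[Kumar 2024, p347]"
  technical: string;  // e.g. "[kumar-endo#ch42-p347]"
}

interface KBAnswer {
  text: string;
  citations: Citation[];
  grounded: boolean;  // result of the grounding check stage
}

// Surface named in the Phase 1 TODO; parameter types are assumptions.
interface KBProvider {
  ingest(docPath: string): Promise<void>;
  query(question: string): Promise<KBAnswer>;
  destroy(): Promise<void>;
}

// The design's key invariant: an answer without grounded citations is refused.
async function answer(kb: KBProvider, question: string): Promise<KBAnswer> {
  const result = await kb.query(question); // hybrid search → rerank → generate
  if (!result.grounded || result.citations.length === 0) {
    return { text: "No supporting source found.", citations: [], grounded: false };
  }
  return result;
}
```

Keeping the refusal rule outside the provider means every backend (Artemis, Deimos, future moons) inherits the same provenance guarantee.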
Activation (opt-in)
Not enabled by default. A moon opts in by adding a skills.kb block:
# lunar.config.yaml
moon: artemis
skills:
  kb:
    mode: sources-only          # or "hybrid" for research moons
    database:
      url: postgres://...
      schema: artemis_kb
    embeddings:
      provider: openai          # required if moon provider is Claude (no embeddings)
      model: text-embedding-3-small
    # parser, reranker, contextual_retrieval, generation all inherit
    # from the moon's LLM provider (cheapest model per component) by default

Without a skills.kb block Lunar boots with no KB tables, no validations, and no skill exposed to the agent.
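The opt-in shape above could be modeled in TypeScript roughly as follows. The type is an illustration derived from the YAML example, not the published config schema; field names and the `kbEnabled` helper are assumptions.

```typescript
// Hypothetical mirror of the skills.kb YAML block shown above.
type KBMode = "sources-only" | "hybrid";

interface KBConfig {
  mode: KBMode;
  database: {
    url: string;    // Postgres connection string
    schema: string; // per-moon schema, e.g. "artemis_kb"
  };
  embeddings?: {
    provider: "openai" | "voyage" | "gemini" | "ollama";
    model: string;
  };
}

interface LunarConfig {
  moon: string;
  skills?: { kb?: KBConfig };
}

// The skill activates only when a skills.kb block is present — no block,
// no KB tables, no validations, no skill exposed to the agent.
function kbEnabled(config: LunarConfig): boolean {
  return config.skills?.kb !== undefined;
}
```

A single presence check like this keeps the default boot path completely free of KB code.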
Package layout
packages/kb/
├── src/
│ ├── index.ts ← public exports (types + KBProvider interface)
│ ├── types/ ← config, query, document types
│ ├── lib/
│ │ ├── embeddings/ ← provider adapters (openai, voyage, ollama, …)
│ │ ├── reranker/ ← cohere + llm-as-judge fallback
│ │ ├── chunker/ ← contextual-retrieval chunker
│ │ └── ingestion/ ← doc → parse → chunk → embed → insert
│ └── __tests__/
├── migrations/
│ └── 001_init.sql ← DDL templated per-moon schema
└── sidecar/
    └── README.md ← Docling Python sidecar design notes

Phase 1 — pending (TODO)
- Implement KBProvider (ingest, query, destroy).
- Schema materializer that runs migrations/*.sql against the moon's schema.
- Embeddings adapters (OpenAI, Voyage, Gemini, Ollama).
- Reranker adapters (Cohere, LLM-as-judge).
- Docling sidecar client (local HTTP).
- Grounding check (regex + source-id lookup).
- Dual citation format ([Kumar 2024, p347] for humans + [kumar-endo#ch42-p347] technical).
- Bindings into the @onmars/lunar-core skill system.
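As an illustration of the "grounding check (regex + source-id lookup)" item, here is one possible sketch. It assumes technical citations look like [kumar-endo#ch42-p347]; the regex, helper names, and grounding rule are hypothetical, not the Phase 1 implementation.

```typescript
// Matches the technical citation format, e.g. "[kumar-endo#ch42-p347]".
// Pattern is an assumption based on the example in the TODO list above.
const CITATION_RE = /\[([a-z0-9-]+)#([a-z0-9-]+)\]/g;

function extractSourceIds(answer: string): string[] {
  return [...answer.matchAll(CITATION_RE)].map((m) => `${m[1]}#${m[2]}`);
}

// An answer is grounded only when it cites at least one source AND every
// cited id resolves to a chunk actually present in the moon's KB schema.
function isGrounded(answer: string, knownChunkIds: Set<string>): boolean {
  const ids = extractSourceIds(answer);
  return ids.length > 0 && ids.every((id) => knownChunkIds.has(id));
}
```

Requiring every id to resolve (rather than any) is the conservative choice: a single fabricated citation marks the whole answer as ungrounded, which matches the design's refuse-when-no-evidence stance.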
License
MIT — onMars Tech
