cairn-index
v1.2.1
Local hybrid index (FTS5 + vector embeddings + AST graph) over a single sqlite file. Curate, ingest, retrieve.
cairn
Local hybrid index for things you intentionally collect — web pages, codebases, files, raw text. Hybrid retrieval (FTS5 + vector embeddings, RRF fused) over a single sqlite file. Lightweight, fast, no daemons.
pnpm add cairn-index   # or npm install cairn-index; the bins are cairn and cairn-mcp
Requires ollama running locally with nomic-embed-text pulled. See setup for the full prereq list and OpenCode wiring.
A daemon-free embedded runtime is available as an opt-in (in-process via node-llama-cpp, GGUFs auto-download to ~/.cairn/models on first use, ~785 MB). Pass runtime: 'embedded' (lib) or set CAIRN_RUNTIME=embedded (CLI). Set CAIRN_OFFLINE=1 to refuse the auto-download and require pre-cached models — useful for air-gapped or strict-egress environments.
quick start
Library:
import { Cairn } from 'cairn-index'
const cairn = new Cairn() // defaults to ~/.cairn, ollama @ 127.0.0.1:11434
await cairn.ingest.add({ kind: 'code', path: './src', label: 'my-project' })
const hits = await cairn.retrieve.search('how does the chunker handle overlap', { k: 5 })
cairn.close()
CLI:
cairn add ./src --label my-project
cairn search "how does the chunker handle overlap" -k 5
cairn graph "fee invariant"
cairn refresh all
what cairn is for
Local-first retrieval grounding for an LLM. You curate what's indexed (no automatic crawling), cairn add brings it in, and either you or a model running over MCP can query the result. Five query surfaces:
- Hybrid chunk search (search): FTS5 + vector embeddings fused via RRF. Default mode; returns ranked text chunks.
- Knowledge graph (graph): entities (functions, structs, concepts) and edges (calls, depends_on, mitigates, references, verifies) extracted from code (tree-sitter, AST-based) and markdown (LLM, hash-gated).
- Composed retrieval (ask): hybrid search + per-hit entity context in one call.
- Shortest path (path): BFS between two entities via batched layer fetch (one SQL per BFS layer).
- Tag-filtered retrieval (tags, --tag): concept entities carry free-form LLM-emitted tags; filter search/ask/graph by tag.
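The batched-layer idea behind the path surface can be sketched as follows. This is a minimal illustration under stated assumptions, not cairn's implementation: the fetchNeighbors callback is hypothetical and stands in for the single SQL query issued per BFS layer, entity IDs are plain strings, and the maxDepth cap is an invented safety limit.

```typescript
// Hypothetical edge shape; cairn's real schema is not shown in this README.
type Edge = { from: string; to: string }

// Hypothetical callback: returns outgoing edges for the WHOLE frontier in one
// call (a stand-in for one SQL query per BFS layer).
type FetchNeighbors = (frontier: string[]) => Edge[]

function shortestPath(
  start: string,
  goal: string,
  fetchNeighbors: FetchNeighbors,
  maxDepth = 8, // assumed cap so a disconnected graph cannot loop for long
): string[] | null {
  if (start === goal) return [start]
  const parent = new Map<string, string>() // child -> node it was reached from
  const visited = new Set<string>([start])
  let frontier = [start]

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = []
    // One batched fetch per layer instead of one fetch per node.
    for (const { from, to } of fetchNeighbors(frontier)) {
      if (visited.has(to)) continue
      visited.add(to)
      parent.set(to, from)
      if (to === goal) {
        // Walk the parent links back to the start, then reverse.
        const hops = [goal]
        let cur = goal
        while (cur !== start) {
          cur = parent.get(cur)!
          hops.push(cur)
        }
        return hops.reverse()
      }
      next.push(to)
    }
    frontier = next
  }
  return null // goal not reachable within maxDepth layers
}
```

Because every node in a layer is expanded by the same call, the number of round trips grows with path length rather than with the number of entities visited.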
Cross-source linking (cairn link sdk program) lets you resolve names across two related sources — SDK calling its on-chain program is the canonical case. Soft-delete + FK cascades keep the graph clean across refreshes and removals.
architecture
- data layer: better-sqlite3 + sqlite-vec. Flat tables (sources, files, chunks, entities, edges, source_links) plus FTS5 + vec0 virtual tables. FK cascades from sources through entities into edges; triggers keep chunks_vec and entities_vec in sync.
- ingestion: gitignore-aware walker → boring chunkers (code: 60-line / 10-overlap; text: ~2000-char / paragraph-snapping) → batch embed via ollama → atomic insert.
- graph: tree-sitter (Rust, TypeScript / TSX, Python) extracts entities and per-fn AST call/ref maps. Edge layer derives calls/depends_on parse edges intra-file, cross-file, and cross-source (when sources are explicitly linked via cairn link). Optional LLM pass over markdown emits mitigates/references/verifies doc edges (capped confidence 0.7, hash-gated) plus free-form tags on concept entities (slugified, multi-tag, queryable via --tag). Soft-delete on entities; clean rebuild on every refresh.
- retrieval: query embedded once; FTS5 + vec each return top-50; reciprocal rank fusion with k=60; hydrate via narrow indexed seeks. No reranker, no tuning. Empty-query short-circuit.
- interface: Cairn class wires per-concern providers (Db, Embed, Chat, Ingest, Retrieve). Embed and Chat are runtime-agnostic: each takes an EmbedRuntime / ChatRuntime (Ollama or in-process llama.cpp), so swapping runtimes is one line. CLI for ingestion, graph queries, and admin (add/list/search/ask/graph/path/tags/refresh/reindex/link/unlink/links/remove). MCP server exposes search/list/add/graph/ask/path/tags/refresh so models can both query and maintain the index. Mutating ops remove/link/unlink/reindex/init are intentionally CLI-only: destructive or topology-changing actions require a human at the terminal.
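The fusion step named in the retrieval item above can be sketched like this. It is a generic reciprocal rank fusion over two ranked ID lists, not cairn's own code: the function name rrfFuse, the string chunk IDs, and the 1-based rank convention are assumptions made for illustration; only k = 60 comes from the description above.

```typescript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank),
// where rank is the 1-based position of id in each list (assumed convention).
function rrfFuse(
  rankings: string[][],
  k = 60,
  limit = 5,
): { id: string; score: number }[] {
  const scores = new Map<string, number>()
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // index is 0-based, so the rank is index + 1
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1))
    })
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score) // highest fused score first
    .slice(0, limit)
}

// Usage: fuse a keyword ranking with a vector ranking (IDs are made up).
const ftsHits = ['c1', 'c2', 'c3']
const vecHits = ['c3', 'c1', 'c4']
const fused = rrfFuse([ftsHits, vecHits])
```

Only ranks enter the score, so the two result lists never need their raw scores normalised against each other, which fits the "no reranker, no tuning" stance.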
configuration & safety (v1.2+)
Cairn is a curated index — you trust what you put in, and you control the surface around ingestion via env vars. Defaults are sensible for a single-user developer setup; the env vars matter in shared, agent-driven, or compliance-sensitive deployments.
| Env var | Default | Purpose |
|---|---|---|
| CAIRN_OFFLINE | unset | When 1 or true, blocks fetchWeb (no cairn add <url>) and blocks non-local model resolution (no Hugging Face GGUF auto-download). Localhost ollama still allowed. |
| CAIRN_ALLOWED_ROOTS | unset | Comma-separated absolute paths. When set, cairn add rejects any local path (code / file / pdf) outside these roots. Defense-in-depth for MCP-driven ingestion. |
| CAIRN_MAX_INGEST_FILES | 10000 | Aborts directory ingestion before any chunking/embedding if file count exceeds this. CLI --force bypasses; MCP intentionally has no force option. |
| CAIRN_MAX_INGEST_BYTES | 500 MB | Same shape as the file cap, on total bytes. |
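To make the CAIRN_ALLOWED_ROOTS row concrete, here is a minimal sketch of how a comma-separated root list could be checked against a candidate path. The helper names parseAllowedRoots and isPathAllowed are hypothetical, and the containment rule is an assumption about reasonable behavior rather than cairn's actual check.

```typescript
import * as path from 'node:path'

// Parse a CAIRN_ALLOWED_ROOTS-style value: comma-separated absolute paths.
// An unset or empty value yields no roots.
function parseAllowedRoots(value: string | undefined): string[] {
  if (!value) return []
  return value
    .split(',')
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .map((s) => path.resolve(s))
}

// True when no roots are configured, or when target is a root itself or
// lies beneath one. Assumption: symlinks are NOT resolved in this sketch.
function isPathAllowed(target: string, roots: string[]): boolean {
  if (roots.length === 0) return true
  const resolved = path.resolve(target)
  return roots.some((root) => {
    const rel = path.relative(root, resolved)
    if (rel === '') return true // target is the root itself
    const escapes = rel === '..' || rel.startsWith('..' + path.sep)
    return !escapes && !path.isAbsolute(rel)
  })
}
```

Comparing with path.relative rather than a string prefix avoids accepting a sibling such as /home/me/projects-evil when the root is /home/me/projects. A stricter variant would also resolve symlinks before comparing.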
Network egress. Three categories, all bounded: (a) explicit cairn add <url> (user-initiated web ingest), (b) localhost ollama only when CAIRN_RUNTIME=ollama, (c) Hugging Face GGUF download on first use only when CAIRN_RUNTIME=embedded and the model isn't pre-cached. CAIRN_OFFLINE=1 blocks (a) and (c); (b) stays available because it's localhost.
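The three egress categories can be read as a small decision function. The sketch below only restates the rules in the paragraph above; the Egress labels and the possibleEgress helper are hypothetical names, and treating ollama as the runtime when CAIRN_RUNTIME is unset is an assumption based on the embedded runtime being described as opt-in.

```typescript
type Env = Record<string, string | undefined>
type Egress = 'web-ingest' | 'localhost-ollama' | 'gguf-download'

// CAIRN_OFFLINE counts as enabled when set to "1" or "true".
function isOffline(env: Env): boolean {
  return env.CAIRN_OFFLINE === '1' || env.CAIRN_OFFLINE === 'true'
}

// Lists which egress categories remain possible for a given environment.
function possibleEgress(env: Env, modelCached: boolean): Egress[] {
  const offline = isOffline(env)
  const runtime = env.CAIRN_RUNTIME ?? 'ollama' // assumption: ollama by default
  const out: Egress[] = []
  if (!offline) out.push('web-ingest') // (a) explicit cairn add <url>
  if (runtime === 'ollama') out.push('localhost-ollama') // (b) stays available offline
  if (runtime === 'embedded' && !modelCached && !offline) {
    out.push('gguf-download') // (c) first-use model download
  }
  return out
}
```

With CAIRN_OFFLINE=1 and CAIRN_RUNTIME=embedded the function returns an empty list, which matches the air-gapped case: nothing leaves the machine, and models must already be cached.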
Trust model. Cairn doesn't auto-crawl — every source enters via an explicit cairn add (CLI, library, or MCP) by you or an agent you authorized. Indexed content is queryable later, including by future MCP-connected agents — that is the point. For sensitive content, isolate it in a separate dbPath. The MCP host (Claude Desktop, OpenCode, etc.) controls which agents can connect; cairn assumes that gating is done host-side.
stack
| dep | purpose |
|---|---|
| better-sqlite3 | sync sqlite driver |
| sqlite-vec | vector search extension |
| tree-sitter + tree-sitter-rust / tree-sitter-typescript / tree-sitter-python | AST entity + call extraction |
| linkedom | HTML → text |
| @modelcontextprotocol/sdk | MCP server |
| zod | tool input + LLM output schemas |
Embeddings via ollama (nomic-embed-text, 768-dim) — required prerequisite. Doc-extraction uses an ollama chat model (default Qwen3-0.6B-GGUF:UD-Q8_K_XL); skipped silently if not pulled. The embedded runtime substitutes equivalent in-process GGUFs. See setup.
docs
- setup — prereqs, install, CLI, cross-source linking, OpenCode wiring, troubleshooting.
- design — data model, retrieval pipeline, decisions, what's out of scope.
- graph — knowledge-graph layer (entities, edges, soft-delete, AST extraction) layered over the chunk index. Additive; chunk search is unchanged.
- next — completed roadmap with retrospectives plus the remaining open items (more language grammars, impl-method ID disambiguation).
- v1.1 tags — concept-tag design + LLM emission + storage shape.
- enterprise — pitch / reference for an enterprise-shaped variant on Postgres + Azure OpenAI.
