ralph-hero-knowledge-index

v0.1.49

Published

10 days ago

Semantic search over a personal knowledge corpus (`thoughts/` plus any other markdown roots). Uses SQLite + `sqlite-vec` for local embeddings and exposes a stdio MCP server to Claude Code.

cdubiel08

ralph-knowledge

Quick start

cd plugin/ralph-knowledge
npm install
npm run build
npm run reindex                     # index default root (../../thoughts)
npm run reindex -- /path/to/roots   # CLI override, see "Configuration"

A SQLite file is written to ~/.ralph-hero/knowledge.db by default.

Configuration

ralph-knowledge reads configuration from four sources. Each source can be missing; later sources fill in only what earlier sources did not provide. The precedence for roots (directories to index), from highest to lowest, is:

1. CLI positional arguments (npm run reindex -- /a /b /c)
2. RALPH_KNOWLEDGE_DIRS environment variable (comma-separated)
3. roots[] in ~/.ralph/knowledge.config.json
4. Fallback: ../../thoughts (relative to the current working directory)

dbPath precedence is independent:

1. CLI positional argument ending in .db
2. RALPH_KNOWLEDGE_DB environment variable
3. dbPath in ~/.ralph/knowledge.config.json
4. Default: ~/.ralph-hero/knowledge.db
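
The two precedence chains above can be sketched as a single pure function. This is only an illustration under stated assumptions: `resolveSettings`, `Sources`, and the source labels other than `config` are hypothetical names, and the package's real resolver may be structured differently.

```typescript
// Hypothetical sketch: only the precedence order is taken from the README.
interface Sources {
  cliArgs: string[]; // positionals passed after `npm run reindex --`
  env: Record<string, string | undefined>;
  config: { roots?: string[]; dbPath?: string };
}

interface Resolved {
  roots: string[];
  rootsSource: "cli" | "env" | "config" | "fallback";
  dbPath: string;
}

function resolveSettings({ cliArgs, env, config }: Sources): Resolved {
  // A positional ending in .db is the database path; every other positional is a root.
  const cliDb = cliArgs.find((arg) => arg.endsWith(".db"));
  const cliRoots = cliArgs.filter((arg) => !arg.endsWith(".db"));
  const envRoots = (env.RALPH_KNOWLEDGE_DIRS ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
  const configRoots = config.roots ?? [];

  let roots: string[];
  let rootsSource: Resolved["rootsSource"];
  if (cliRoots.length > 0) {
    roots = cliRoots;
    rootsSource = "cli";
  } else if (envRoots.length > 0) {
    roots = envRoots;
    rootsSource = "env";
  } else if (configRoots.length > 0) {
    roots = configRoots;
    rootsSource = "config";
  } else {
    roots = ["../../thoughts"];
    rootsSource = "fallback";
  }

  // dbPath is resolved independently of roots; the first non-empty value wins.
  const dbPath =
    cliDb || env.RALPH_KNOWLEDGE_DB || config.dbPath || "~/.ralph-hero/knowledge.db";
  return { roots, rootsSource, dbPath };
}
```

Because the chains are independent, a run can take its roots from the environment while taking the database path from the command line.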

`~/.ralph/knowledge.config.json`

Create this file to persist multi-root setups and global ignore patterns. The path can be overridden via the RALPH_KNOWLEDGE_CONFIG env var.

{
  "roots": [
    "~/projects/ralph-hero/thoughts",
    "~/projects/landcrawler-ai/thoughts",
    "~/notes"
  ],
  "ignorePatterns": [
    "**/drafts/**",
    "**/worktrees/**",
    "*.bak"
  ],
  "dbPath": "~/.ralph-hero/knowledge.db"
}

All fields are optional. Tilde (~) prefixes in roots[] and dbPath are expanded to the user's home directory at load time. Malformed JSON, non-object top levels, and non-string entries are ignored with a warning.
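
A minimal sketch of the lenient loading described above, assuming a hypothetical `parseConfig` helper that receives the file contents and the home directory as arguments. The warning texts are invented for illustration, and the real loader may validate differently.

```typescript
// Hypothetical sketch of lenient config parsing; not the package's actual loader.
interface KnowledgeConfig {
  roots?: string[];
  ignorePatterns?: string[];
  dbPath?: string;
}

function expandTilde(path: string, home: string): string {
  if (path === "~") return home;
  return path.startsWith("~/") ? home + path.slice(1) : path;
}

function parseConfig(
  text: string,
  home: string,
  warn: (message: string) => void = (message) => console.warn(message),
): KnowledgeConfig {
  let raw: unknown;
  try {
    raw = JSON.parse(text);
  } catch {
    warn("knowledge.config.json: malformed JSON, ignoring file");
    return {};
  }
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    warn("knowledge.config.json: top level is not an object, ignoring file");
    return {};
  }
  const fields = raw as Record<string, unknown>;

  // Keep only string entries of an array field; warn about anything dropped.
  const stringList = (key: string): string[] | undefined => {
    const value = fields[key];
    if (value === undefined) return undefined;
    if (!Array.isArray(value)) {
      warn(`knowledge.config.json: ${key} is not an array, ignoring it`);
      return undefined;
    }
    const kept = value.filter((entry): entry is string => typeof entry === "string");
    if (kept.length !== value.length) {
      warn(`knowledge.config.json: dropped non-string entries from ${key}`);
    }
    return kept;
  };

  const config: KnowledgeConfig = {};
  const roots = stringList("roots");
  if (roots) config.roots = roots.map((root) => expandTilde(root, home));
  const ignorePatterns = stringList("ignorePatterns");
  if (ignorePatterns) config.ignorePatterns = ignorePatterns;
  if (typeof fields.dbPath === "string") {
    config.dbPath = expandTilde(fields.dbPath, home);
  } else if (fields.dbPath !== undefined) {
    warn("knowledge.config.json: dbPath is not a string, ignoring it");
  }
  return config;
}
```

Returning an empty object on any structural problem lets the caller fall through to the next configuration source instead of aborting.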

On startup, ralph-knowledge logs which source provided the roots, e.g.:

Using roots from: config

Ignoring files

Per-root .ralphignore files use full gitignore syntax and are layered on top of the config's ignorePatterns and the following default globals (always applied):

.claude/
node_modules/
dist/
.git/
*.log

Example .ralphignore at the top of a root directory:

# Skip a whole subtree
.claude/worktrees/**

# Skip drafts but keep the index
drafts/**
!drafts/INDEX.md

# Skip anything ending in .bak
*.bak

Patterns behave exactly like .gitignore:

**/name/** matches everything beneath a directory called name, at any depth.
A leading ! negates an earlier match, re-including a path.
A trailing / makes the pattern directory-only.

Directories whose names start with . or _ are also always skipped, as a fast-path before any matcher is consulted.
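
To make the layering concrete, here is a deliberately simplified sketch. It assumes a hand-rolled matcher that covers only the pattern forms shown above (`*`, `**/`, a trailing `/**`, a leading `!`, a trailing `/`); all function names are hypothetical, and the package itself may well rely on a dedicated gitignore library instead.

```typescript
// Simplified gitignore-style matching for illustration; hypothetical names throughout.
const DEFAULT_PATTERNS = [".claude/", "node_modules/", "dist/", ".git/", "*.log"];

interface Rule { regex: RegExp; negated: boolean; dirOnly: boolean; }

function compileRule(line: string): Rule | null {
  let pattern = line.trim();
  if (pattern === "" || pattern.startsWith("#")) return null; // blank line or comment
  const negated = pattern.startsWith("!");
  if (negated) pattern = pattern.slice(1);
  const dirOnly = pattern.endsWith("/");
  if (dirOnly) pattern = pattern.slice(0, -1);
  const anchored = pattern.includes("/"); // an inner slash pins the pattern to the root
  if (pattern.startsWith("/")) pattern = pattern.slice(1);

  let source = "";
  for (let i = 0; i < pattern.length; ) {
    if (pattern.startsWith("**/", i)) { source += "(?:.*/)?"; i += 3; }
    else if (pattern.startsWith("/**", i) && i + 3 === pattern.length) { source += "/.*"; i += 3; }
    else if (pattern.charAt(i) === "*") { source += "[^/]*"; i += 1; }
    else { source += pattern.charAt(i).replace(/[.+?^${}()|[\]\\]/g, "\\$&"); i += 1; }
  }
  const prefix = anchored ? "^" : "^(?:.*/)?";
  return { regex: new RegExp(prefix + source + "$"), negated, dirOnly };
}

function compile(lines: string[]): Rule[] {
  return lines.map(compileRule).filter((rule): rule is Rule => rule !== null);
}

// Last matching rule wins, so a later "!" rule can re-include a path.
function matches(rules: Rule[], path: string, isDir: boolean): boolean {
  let ignored = false;
  for (const rule of rules) {
    if (rule.dirOnly && !isDir) continue;
    if (rule.regex.test(path)) ignored = !rule.negated;
  }
  return ignored;
}

function createIgnoreMatcher(configPatterns: string[], ralphignoreLines: string[]) {
  const defaults = compile(DEFAULT_PATTERNS); // checked separately, so never negated
  const layered = compile([...configPatterns, ...ralphignoreLines]);
  const hidden = (path: string, isDir: boolean) =>
    matches(defaults, path, isDir) || matches(layered, path, isDir);

  return (relPath: string, isDir: boolean): boolean => {
    const parts = relPath.split("/");
    const dirNames = isDir ? parts : parts.slice(0, -1);
    // Fast path: dot- and underscore-prefixed directories are always skipped.
    if (dirNames.some((name) => name.startsWith(".") || name.startsWith("_"))) return true;
    // An ignored ancestor directory hides everything below it.
    for (let depth = 1; depth < parts.length; depth++) {
      if (hidden(parts.slice(0, depth).join("/"), true)) return true;
    }
    return hidden(relPath, isDir);
  };
}
```

Called as `createIgnoreMatcher(config.ignorePatterns ?? [], ralphignoreLines)`, the returned function takes a root-relative path with forward slashes plus a flag saying whether the path is a directory.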

Choosing between `knowledge_search` and `knowledge_recall`

ralph-knowledge exposes two retrieval MCP tools that wrap the same underlying hybrid search. They differ on who decides the tier policy:

| Tool | When to use | Tier handling |
|------|-------------|---------------|
| knowledge_search | Power-user / explicit path. You know the tier and want full control over rerank, lambda, return_diagnostics, chunk metadata, etc. | You pass memory_tier explicitly (doc, raw, reflection, wiki, or any; default any). |
| knowledge_recall | Default for agents and skills. You declare your role and the tool picks the right tiers. | A role-keyed policy fans out one rerank-enabled hybrid.search() per tier in the role's list, then merges and re-ranks. |

Role -> tier policy

knowledge_recall(query, role, ...) follows this fixed policy map:

| Role | Tiers (priority order) | Intent |
|------|------------------------|--------|
| researcher | raw, reflection, doc | Recovery of unfiltered observations + synthesized insights + curated research. Excludes wiki (we are looking for things the wiki does not already cover). |
| planner | reflection, wiki, doc | Bias toward synthesized insights + canonical curated knowledge; excludes raw observations to keep the planning frame stable. |
| implementer | wiki, doc | Only canonical references, never raw memory or speculative reflections. Keeps implementations grounded in agreed-upon truth. |
| reviewer | wiki, doc | Same constraints as implementer: review against the canonical surface, not the raw or speculative tiers. |
| triager | doc, wiki | Doc-first for issue context, wiki as fallback. |
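
Expressed as data, the policy above might look like the following. The tier lists mirror the documented roles; the identifiers `Tier`, `Role`, `ROLE_TIERS`, and `tiersFor` are hypothetical and not necessarily what the package uses internally.

```typescript
// Hypothetical encoding of the documented role -> tier policy.
type Tier = "doc" | "raw" | "reflection" | "wiki";
type Role = "researcher" | "planner" | "implementer" | "reviewer" | "triager";

// Tiers are listed in priority order for each role.
const ROLE_TIERS: Record<Role, readonly Tier[]> = {
  researcher: ["raw", "reflection", "doc"],
  planner: ["reflection", "wiki", "doc"],
  implementer: ["wiki", "doc"],
  reviewer: ["wiki", "doc"],
  triager: ["doc", "wiki"],
};

function tiersFor(role: Role): readonly Tier[] {
  return ROLE_TIERS[role];
}
```

Typing the map as `Record<Role, ...>` means that adding a role to the union without a matching policy entry becomes a compile-time error.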

Cost notes

knowledge_recall always runs the cross-encoder reranker (rerank=true) because role-aware retrieval is the most context-sensitive call path in the surface. Expect a one-time ~0.5-1 s cold-start (ONNX model load on the first call after process boot) and ~25-45 ms per (query, doc) pair on warm calls.
Each tier sub-query targets <50 ms; a 3-tier fanout totals <150 ms before reranking.
If a tier sub-query throws (e.g., transient DB lock), knowledge_recall logs the error to stderr and continues with the remaining tiers — degraded results rather than a hard failure.
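
The degraded-results behaviour can be sketched roughly as follows. This assumes tiers are queried one after another (consistent with the summed latency budget above) and that merged hits are de-duplicated by document id keeping the best score. `recallAcrossTiers`, `Hit`, and `TierSearch` are hypothetical names, and the final cross-encoder re-rank is left out.

```typescript
// Hypothetical sketch of a failure-tolerant tier fan-out; not the package's code.
interface Hit { id: string; tier: string; score: number; }
type TierSearch = (query: string, tier: string) => Promise<Hit[]>;

async function recallAcrossTiers(
  query: string,
  tiers: readonly string[],
  searchTier: TierSearch,
  logError: (message: string) => void = (message) => console.error(message),
): Promise<Hit[]> {
  const bestById = new Map<string, Hit>();
  for (const tier of tiers) {
    try {
      for (const hit of await searchTier(query, tier)) {
        const seen = bestById.get(hit.id);
        // Keep the highest-scoring copy when a document surfaces in several tiers.
        if (seen === undefined || hit.score > seen.score) bestById.set(hit.id, hit);
      }
    } catch (err) {
      // A failing tier is logged and skipped; the remaining tiers still contribute.
      logError(`knowledge_recall: tier "${tier}" failed: ${String(err)}`);
    }
  }
  return Array.from(bestById.values()).sort((a, b) => b.score - a.score);
}
```

Catching per tier rather than around the whole loop is what turns a transient lock on one tier into a thinner result set instead of an error for the caller.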

Power-user override

Skills can mix both tools. For example, /ralph-hero:plan calls knowledge_recall(role="planner", ...) for the default tier-balanced context gather, and ALSO calls knowledge_search(type="research", ...) for an explicit artifact lookup where it needs a precise type filter. Keeping both tools in a skill's allowlist is the recommended pattern.

`knowledge_expert` — domain-keyed memory bundles

knowledge_expert(domain, issue_number, ...) returns a curated context bundle for a named domain — wiki entries, recent reflections, and prior outcomes — so sub-agents become per-domain experts via memory rather than per-domain prompts. It is the domain-keyed companion to knowledge_recall's role-keyed retrieval: role decides which tiers to surface; domain decides which slice of the corpus.

Signature

knowledge_expert(
  domain: string,             // Tag to match (e.g. "auth", "memory-tiers", "ralph-knowledge")
  issue_number: number,       // GitHub issue on whose behalf this call is made — required for telemetry
  limit?: number,             // Max entries per bucket. Default 5.
  recency_window_days?: number, // Reflection age cutoff in days. Default 30.
  path_prefix?: string,       // Optional secondary filter: only docs whose path starts with this prefix.
  session_id?: string,        // Team/hero session ID — passed through to the outcome event.
)

Domain matching uses the tags table (frontmatter tags: arrays are the primary signal). path_prefix is a secondary narrowing filter — not a replacement for tags. Pass "thoughts/shared/" to restrict to the shared corpus, for example.
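
As a rough illustration of how the optional parameters behave, the sketch below applies the documented defaults and derives the reflection cutoff date. The helper names are hypothetical, and the assumption that the window is measured in exact 24-hour days is mine, not the README's.

```typescript
// Hypothetical helpers; only the defaults (5 and 30) come from the signature above.
interface ExpertArgs {
  domain: string;
  issue_number: number;
  limit?: number;
  recency_window_days?: number;
  path_prefix?: string;
  session_id?: string;
}

function withDefaults(args: ExpertArgs) {
  return {
    ...args,
    limit: args.limit ?? 5,
    recency_window_days: args.recency_window_days ?? 30,
  };
}

// Reflections dated before this instant fall outside the recency window.
function reflectionCutoff(now: Date, recencyWindowDays: number): Date {
  const msPerDay = 24 * 60 * 60 * 1000;
  return new Date(now.getTime() - recencyWindowDays * msPerDay);
}

// path_prefix narrows results further; leaving it out keeps every tagged document.
function matchesPrefix(docPath: string, pathPrefix?: string): boolean {
  return pathPrefix === undefined || docPath.startsWith(pathPrefix);
}
```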

Return shape

{
  "query_id": "uuid-v4",
  "domain": "auth",
  "wiki": [ ...DocumentRow ],
  "reflections": [ ...DocumentRow ],
  "prior_outcomes": [ ...OutcomeEventRow ],
  "warning": null
}

| Field | Description |
|-------|-------------|
| query_id | UUID generated per call. Save this and pass it to knowledge_record_outcome as query_id to correlate downstream outcomes back to this expert call. |
| domain | Echo of the requested domain. |
| wiki | Up to limit documents with memory_tier = 'wiki' tagged with domain, ordered by date descending. |
| reflections | Up to limit documents with memory_tier = 'reflection' tagged with domain and dated within recency_window_days. |
| prior_outcomes | Up to limit outcome_events rows whose payload JSON contains domain (pipeline history for this domain). |
| warning | Non-null string when both wiki and reflections are empty, suggesting the caller tag existing docs or broaden the domain term. null on a successful hit. |

Telemetry

Every knowledge_expert call writes an outcome_events row with event_type = 'expert_call'. The payload JSON carries query_id, domain, returned_doc_ids, limit, recency_window_days, path_prefix, and warning. This makes per-domain hit rate queryable from day one:

knowledge_query_outcomes({ event_type: "expert_call", aggregate: true })

Pass query_id to knowledge_record_outcome to tie subsequent phase/research outcomes back to the originating expert call:

knowledge_record_outcome({
  event_type: "research_completed",
  issue_number: 1306,
  query_id: "<query_id from knowledge_expert>",
  verdict: "complete"
})

Degradation

knowledge_expert degrades the same way as the other knowledge tools: it returns an empty bundle with a warning when no matching documents exist, and it never throws on a valid call. If the domain cannot be determined at call time, callers should skip the call rather than pass an empty string.
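
On the caller side, that guidance could be wrapped in a small guard like the one below. `gatherExpertContext` and the injected `callExpert` function are hypothetical; how a skill actually invokes the MCP tool depends on its client.

```typescript
// Hypothetical caller-side guard; the bundle fields follow the documented return shape.
interface ExpertBundle {
  query_id: string;
  domain: string;
  wiki: unknown[];
  reflections: unknown[];
  prior_outcomes: unknown[];
  warning: string | null;
}
type ExpertCall = (args: { domain: string; issue_number: number }) => Promise<ExpertBundle>;

async function gatherExpertContext(
  domain: string | undefined,
  issueNumber: number,
  callExpert: ExpertCall,
  warn: (message: string) => void = (message) => console.warn(message),
): Promise<ExpertBundle | null> {
  const trimmed = (domain ?? "").trim();
  // No usable domain: skip the call entirely instead of sending an empty string.
  if (trimmed === "") return null;

  const bundle = await callExpert({ domain: trimmed, issue_number: issueNumber });
  // An empty bundle is a valid result; surface the warning and carry on.
  if (bundle.warning !== null) warn(`knowledge_expert(${trimmed}): ${bundle.warning}`);
  return bundle;
}
```

Returning `null` for the skipped case keeps "no call was made" distinguishable from "the call found nothing", which matters when the `query_id` is later used for outcome correlation.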

Environment variables

| Variable | Purpose |
|----------|---------|
| RALPH_KNOWLEDGE_CONFIG | Override path to knowledge.config.json (tilde expanded). |
| RALPH_KNOWLEDGE_DIRS | Comma-separated list of roots. Beats config, loses to CLI. |
| RALPH_KNOWLEDGE_DB | Override SQLite path. Beats config.dbPath, loses to a CLI .db positional. |

Benchmarks

Standalone benchmarks live under benchmark/ — see benchmark/README.md for the directory's conventions (scripts are not part of the published npm package and are not run by vitest).

Reranker benchmark (GH-901)

benchmark/reranker-bench.ts compares two ONNX cross-encoder rerankers loaded via the existing @huggingface/transformers v3 dependency:

onnx-community/bge-reranker-v2-m3-ONNX (int8 quantized) — primary candidate
Xenova/ms-marco-MiniLM-L-6-v2 — speed baseline

For ~44 sample queries spanning the five query intent classes (prior-work topic, plan-by-issue lookup, claim evidence, epic context, hero orientation), the script fetches top-20 RRF candidates, reranks each candidate set with both models, and writes a TSV table with cold-start latency, p50/p95 per-pair latency, batch-of-20 latency, RSS memory delta, and top-3 agreement vs RRF-only. Results land at benchmark/results-YYYY-MM-DD.tsv; the most recent run is checked into the repo.

RALPH_KNOWLEDGE_DB=~/.ralph-hero/knowledge.db \
  npx tsx plugin/ralph-knowledge/benchmark/reranker-bench.ts

The script does not modify hybrid-search.ts; production wiring of a default reranker is a separate follow-up gated on the benchmark's findings.
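
For readers interpreting the TSV columns, here is one plausible way the latency percentiles and the top-3 agreement figure could be computed. This is an assumption for illustration only: the benchmark script may use a different percentile method or agreement definition, and both function names are hypothetical.

```typescript
// Hypothetical metric helpers; the benchmark script may define these differently.

// Nearest-rank percentile over a list of latency samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("percentile of an empty sample set");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Fraction of the reranked top 3 that also appears in the RRF-only top 3.
function top3Agreement(rrfOrder: string[], rerankedOrder: string[]): number {
  const baseline = new Set(rrfOrder.slice(0, 3));
  const shared = rerankedOrder.slice(0, 3).filter((id) => baseline.has(id)).length;
  return shared / 3;
}
```

Under this definition, an agreement of 1 means the reranker kept the same three documents at the top (in any order), while lower values mean it promoted candidates that RRF alone had ranked further down.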
