claude-context-rag
v0.2.2
Persistent RAG memory layer for Claude Code sessions.
Claude Code starts every session cold — no memory of past decisions, files explored, bugs fixed, or architectural choices made. This tool fixes that by automatically capturing your session context into a local vector database and injecting relevant history back into each new session.
How It Works
End of Claude Code session
↓
Stop hook → claude-context-rag ingest
↓
Chunks extracted → embedded locally → stored in LanceDB
Start of new Claude Code session
↓
SessionStart hook → claude-context-rag inject
↓
Similar chunks retrieved → formatted as markdown
↓
Claude starts knowing what you were working on

Everything runs locally. No API keys, no external servers, no data leaving your machine.
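The retrieval step in the second flow can be illustrated with a small in-memory sketch. This is not LanceDB's API and not the package's actual code: `StoredChunk`, `cosine`, `topK`, and `toMarkdown` are hypothetical names, and the real store performs the vector search itself. The sketch only shows the idea of filtering by project, ranking by similarity, and formatting the winners as markdown.

```typescript
// Hypothetical stand-in for a stored row; the real schema may differ.
interface StoredChunk {
  project: string; // absolute project path
  text: string;
  vector: number[]; // embedding of `text`
}

// Cosine similarity between two vectors (missing components count as 0).
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Keep only chunks from one project, then return the k most similar to the query.
function topK(chunks: StoredChunk[], query: number[], project: string, k: number): StoredChunk[] {
  return chunks
    .filter((c) => c.project === project)
    .map((c) => ({ chunk: c, score: cosine(c.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((s) => s.chunk);
}

// Format retrieved chunks as a markdown bullet list.
function toMarkdown(chunks: StoredChunk[]): string {
  return chunks.map((c) => `- ${c.text}`).join("\n");
}
```

In the real tool the query vector would come from the local embedding model; here it is just an array of numbers passed in by the caller.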
Prerequisites
- Node.js 18+
- Claude Code (latest)
That's it. The embedding model (Xenova/all-MiniLM-L6-v2, ~25 MB) is downloaded automatically on first use and cached at ~/.cache/huggingface/. No Ollama or separate model server needed.
Installation
npm install -g claude-context-rag
Then run setup once:
claude-context-rag init
init creates ~/.claude-context-rag/ with your config and database, and registers two hooks in ~/.claude/settings.json: a Stop hook to ingest sessions and a SessionStart hook to inject context. It is idempotent, so it is safe to run again.
After init, the system is fully automatic.
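To make "idempotent" concrete, here is a minimal sketch of how a hook could be registered without creating duplicates. The `Settings`, `HookGroup`, and `HookCommand` shapes are an assumption about the layout of the hooks section in ~/.claude/settings.json, and `registerHook` is a hypothetical helper, not the package's actual init code.

```typescript
// Assumed shape of the hooks section in settings.json.
interface HookCommand {
  type: "command";
  command: string;
}
interface HookGroup {
  matcher?: string;
  hooks: HookCommand[];
}
interface Settings {
  hooks?: Record<string, HookGroup[]>;
  [key: string]: unknown; // unrelated settings are preserved untouched
}

// Add `command` under `event` unless it is already registered.
function registerHook(settings: Settings, event: string, command: string): Settings {
  const groups = settings.hooks?.[event] ?? [];
  const present = groups.some((g) => g.hooks.some((h) => h.command === command));
  if (present) return settings; // already registered: return the input unchanged
  const entry: HookGroup = { hooks: [{ type: "command", command }] };
  return { ...settings, hooks: { ...settings.hooks, [event]: [...groups, entry] } };
}
```

Calling `registerHook` twice with the same event and command returns the same object the second time, which is the property that makes re-running setup safe.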
CLI Reference
| Command | Description |
|---------|-------------|
| claude-context-rag init | One-time setup: create dirs, write config, register hooks |
| claude-context-rag ingest [--session <path>] | Ingest a session file into the vector store |
| claude-context-rag inject [--project <dir>] | Output injection JSON for a project (used by the hook) |
| claude-context-rag search <query> [--project <dir>] [--limit <n>] | Semantic search across stored context |
| claude-context-rag preview [--project <dir>] [--session <path>] | Preview injection output or extracted chunks (no writes) |
| claude-context-rag stats | Show chunk counts by type and project |
| claude-context-rag clear [--project <dir>] [--all] | Delete stored context for a project or all projects |
Examples
# Search your context
claude-context-rag search "why did we choose LanceDB"
claude-context-rag search "auth middleware" --project /Users/you/myapp --limit 5
# See what Claude will see at the start of a session
claude-context-rag preview --project /Users/you/myapp
# Preview what chunks would be extracted from a past session (no writes)
claude-context-rag preview --session ~/.claude/projects/<hash>/<session-id>.jsonl
# Check how much context is stored
claude-context-rag stats
# Manually ingest a specific session file
claude-context-rag ingest --session ~/.claude/projects/<hash>/<session-id>.jsonl
# Delete context for one project
claude-context-rag clear --project /Users/you/myapp
What Gets Stored
The session parser reads Claude Code's .jsonl session logs and extracts chunks by type:
| Chunk Kind | What It Captures |
|-----------|-----------------|
| decision | Claude's reasoning + what was asked — the "why" behind choices |
| error_fix | Failed commands and their resolution — prevents repeat mistakes |
| tool_call | Files written/edited, bash commands run |
| file_exploration | Files read or searched |
| conversation_turn | Substantial Q&A exchanges |
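The chunk kinds in the table map naturally onto a TypeScript union. The sketch below is illustrative only: `Chunk`, `parseJsonl`, and `countByKind` are hypothetical names, and the actual parser and row shape in the package may look different. It shows a tolerant line-by-line read of .jsonl text and a per-kind tally similar in spirit to what stats reports.

```typescript
// The five chunk kinds listed in the table above.
type ChunkKind = "decision" | "error_fix" | "tool_call" | "file_exploration" | "conversation_turn";

// Hypothetical chunk shape for illustration.
interface Chunk {
  kind: ChunkKind;
  project: string;
  text: string;
}

// Parse .jsonl text (one JSON value per line); blank or malformed lines are skipped.
function parseJsonl(text: string): unknown[] {
  const entries: unknown[] = [];
  text.split("\n").forEach((line) => {
    const trimmed = line.trim();
    if (trimmed === "") return;
    try {
      entries.push(JSON.parse(trimmed));
    } catch {
      // Ignore a line that is not valid JSON instead of failing the whole file.
    }
  });
  return entries;
}

// Tally chunks by kind.
function countByKind(chunks: Chunk[]): Record<ChunkKind, number> {
  const counts: Record<ChunkKind, number> = {
    decision: 0,
    error_fix: 0,
    tool_call: 0,
    file_exploration: 0,
    conversation_turn: 0,
  };
  chunks.forEach((c) => {
    counts[c.kind] += 1;
  });
  return counts;
}
```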
What Gets Created on Your Machine
~/.claude-context-rag/
├── config.json          # Settings
└── db/                  # LanceDB vector database
~/.cache/huggingface/    # Embedding model cache (~25 MB, downloaded once)
~/.claude/settings.json  # Claude Code hooks added by init
A typical session of 50–100 exchanges adds roughly 300–600 KB to the database.
Configuration
~/.claude-context-rag/config.json is created by init with these defaults:
{
  "ollamaUrl": "http://localhost:11434",
  "embeddingModel": "nomic-embed-text",
  "vectorDim": 768,
  "maxChunksPerSession": 200,
  "ingestedSessions": []
}
| Field | Default | Notes |
|-------|---------|-------|
| maxChunksPerSession | 200 | Lower to reduce storage, raise for very long sessions |
| ingestedSessions | [] | Auto-managed. Remove an ID to force re-ingest of that session |
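The two documented fields interact during ingest roughly as follows. This is a sketch under stated assumptions, not the package's code: `RagConfig`, `planIngest`, and `forgetSession` are hypothetical names, and truncating to the first `maxChunksPerSession` chunks is a guess at how the cap is applied.

```typescript
// Only the two fields documented in the table above.
interface RagConfig {
  maxChunksPerSession: number;
  ingestedSessions: string[];
}

// Decide what to store for a session. Returns null when the session ID was
// already ingested; otherwise returns the capped chunks and an updated config.
function planIngest<T>(
  config: RagConfig,
  sessionId: string,
  chunks: T[],
): { toStore: T[]; config: RagConfig } | null {
  if (config.ingestedSessions.indexOf(sessionId) !== -1) return null;
  return {
    // Assumption: the cap keeps the first N chunks.
    toStore: chunks.slice(0, config.maxChunksPerSession),
    config: { ...config, ingestedSessions: [...config.ingestedSessions, sessionId] },
  };
}

// Remove a session ID so the next ingest processes that session again.
function forgetSession(config: RagConfig, sessionId: string): RagConfig {
  return { ...config, ingestedSessions: config.ingestedSessions.filter((id) => id !== sessionId) };
}
```

`forgetSession` mirrors the manual step of deleting an ID from ingestedSessions in config.json to force a re-ingest.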
Troubleshooting
No results from search / stats shows 0 chunks
The Stop hook may not have fired yet (requires ending a Claude Code session). Manually ingest a past session:
claude-context-rag ingest --session ~/.claude/projects/<project-hash>/<session-id>.jsonl
Hooks not firing
# Check hooks are in settings.json
grep claude-context-rag ~/.claude/settings.json
# Re-run init if missing (safe to run again)
claude-context-rag init
Force re-ingest a session
Remove the session ID from ingestedSessions in ~/.claude-context-rag/config.json, then run ingest again.
Project Structure
claude-context-rag/
├── bin/
│   └── claude-context-rag.js   # Entry point shebang → dist/cli.js
├── src/
│   ├── cli.ts                  # CLI wiring (commander subcommands)
│   ├── paths.ts                # All filesystem paths in one place
│   ├── config.ts               # Config schema, read/write helpers
│   ├── parser/
│   │   ├── types.ts            # TypeScript types for .jsonl entries
│   │   ├── reader.ts           # Streaming .jsonl reader
│   │   └── chunker.ts          # Extracts chunks from raw session entries
│   ├── embedder/
│   │   └── local.ts            # @xenova/transformers (Xenova/all-MiniLM-L6-v2, 384-dim)
│   ├── store/
│   │   ├── schema.ts           # LanceDB row shape (ChunkRecord)
│   │   └── lancedb.ts          # insert, search, clear, stats
│   └── commands/
│       ├── init.ts
│       ├── ingest.ts
│       ├── inject.ts
│       ├── inject-format.ts
│       ├── search.ts
│       ├── preview.ts
│       ├── stats.ts
│       └── clear.ts
└── dist/                       # Compiled JS (generated by build)
Development
# Run from TypeScript source directly
npm run dev -- search "why did we choose LanceDB"
npm run dev -- preview --project /Users/you/myproject
# Build
npm run build
# Run built version
node dist/cli.js <command>
Key design decisions
- No server required: LanceDB is embedded (files on disk, not a process)
- No API keys: embeddings run in-process via @xenova/transformers; nothing leaves the machine
- Silent failure at session start: inject never crashes a Claude session; it always outputs {"continue": true}
- Deduplication by session ID: ingest is idempotent; running it multiple times is safe
- Project isolation: all queries filter by absolute project path
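The "silent failure" decision can be sketched as a wrapper that converts any error into the minimal output. `safeInject` is a hypothetical name, the function is synchronous for brevity, and the `hookSpecificOutput` / `additionalContext` nesting is an assumption about the hook output format; only `{"continue": true}` is confirmed by this README.

```typescript
// Build the JSON a session-start hook would print. Any error thrown while
// building context is swallowed so the session is never blocked.
function safeInject(buildContext: () => string): string {
  try {
    const context = buildContext();
    return JSON.stringify({
      continue: true,
      // Assumed field names for passing context back to the session.
      hookSpecificOutput: { hookEventName: "SessionStart", additionalContext: context },
    });
  } catch {
    // Fallback: no context, but the session still starts normally.
    return JSON.stringify({ continue: true });
  }
}
```

The trade-off is that a broken database or missing model fails quietly, so commands such as preview and stats remain the way to check that injection is actually working.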
Sharing With Teammates
Context is personal and local — not shared between developers. Each person builds their own database from their own sessions.
To onboard a teammate:
npm install -g claude-context-rag
claude-context-rag init
Their sessions will be captured automatically from that point on.
