agents-vault
v1.1.0
Published
Agents Vault — CLI-first retrieval system for coding agents
Maintainers
Readme
AgentVault
CLI-first RAG system for project knowledge.
AgentVault indexes local files into a vector database and exposes a lightweight CLI interface for grounded question answering. It is designed to work alongside coding agents such as OpenClaw, Claude Code, Codex, or any automation tool that can call shell commands.
Instead of repeatedly scanning entire repositories, AgentVault performs semantic retrieval over a pre-built index and returns only the relevant context needed to answer a question.
This significantly reduces token usage, latency, and cost when working with large local projects.
Why AgentVault
Modern coding agents often answer questions by reading large portions of a repository. While effective, this approach becomes expensive and slow when projects grow.
AgentVault introduces a retrieval layer between the agent and the filesystem.
The workflow becomes:
Agent → AgentVault CLI → Vector Retrieval → Grounded Answer
Hybrid Search & RRF Reranking
AgentVault uses a hybrid retrieval pipeline that combines two complementary search strategies:
- Vector search — cosine similarity over embeddings captures semantic meaning, finding chunks that are conceptually related to the query even when exact keywords don't match.
- BM25 keyword search — FTS5 full-text search over chunk content, powered by SQLite's built-in Porter stemming tokenizer. This catches exact terminology and keyword matches that vector search may rank lower.
Both result sets are merged and reranked using Reciprocal Rank Fusion (RRF) with a default constant of k=60. RRF is rank-based rather than score-based, which avoids the problem of incomparable score distributions between vector similarity and BM25 relevance. Each chunk receives a fused score of 1/(k + vectorRank) + 1/(k + bm25Rank), and the top results are passed to the answer provider.
The full retrieval pipeline:
embed query → vector search + BM25 search → RRF rerank → context reduction → grounded answerThis runs automatically — no extra flags needed.
Key advantages:
Token efficient – avoids repeatedly loading entire codebases
Agent friendly – simple CLI interface that agents can call directly
Local-first – no external vector database required
Deterministic outputs – predictable CLI responses for automation
Auditable interactions – all queries saved as markdown logs
AgentVault acts as a knowledge gateway for local repositories, allowing agents to query project knowledge instead of scanning files every time.
Features
configure: interactive provider and model setup.ingest: recursive file discovery, parsing, chunking, embedding, persistence.ask: one-shot grounded answer with citations and markdown export.status: configuration + index health summary.doctor: environment and storage diagnostics.- Local conversation exports to
.conversations/YYYY-MM-DD/*.md.
Requirements
- Node.js
24+ pnpm10+- OpenAI or Azure OpenAI credentials, or Ollama for local models
Installation
npm install -g agents-vaultThen run:
agents-vault --helpQuick Start
agents-vault configure
agents-vault ingest --source ./docs --project my-project
agents-vault ask "How does auth work?" --project my-projectFrom source
pnpm install
pnpm build
node apps/cli/dist/index.js --helpCommand Reference
Top-level help:
agents-vault --helpConfigure provider and model:
agents-vault configureIngest project documents:
agents-vault ingest --source ./docs --project my-project
agents-vault ingest --source ./docs --project my-project --reindexAsk grounded questions:
agents-vault ask "How does configuration work?" --project my-project
agents-vault ask "What is the architecture?" --project my-project --top-k 6Health/status:
agents-vault status --project my-project
agents-vault doctorConfiguration and Secrets
Agent Vault uses two files in ~/.agents-vault/:
agents-vault.json: non-secret config (provider, models, output directory, db path, default project).auth.json: encrypted credentials.auth.key: local encryption key used to decryptauth.json.
Behavior:
agents-vault configureupdates provider/model config and can capture credentials interactively.- At runtime, credentials are loaded into process environment from the encrypted auth vault.
- You can still provide credentials through shell environment variables if preferred.
Using Ollama (Local Models)
Agent Vault supports Ollama for fully local inference — no API keys, no cloud calls.
1. Install Ollama
# macOS
brew install ollama
# or download from https://ollama.com/download2. Pull models
You need one model for answering questions and one for generating embeddings.
# Start the Ollama server
ollama serve
# Pull an answer model
ollama pull gpt-oss:20b
# Pull an embedding model
ollama pull embeddinggemmaOther popular combinations:
| Answer model | Embedding model | Notes |
|---|---|---|
| gpt-oss:20b | embeddinggemma | Recommended — strong reasoning + fast embeddings |
| llama3.2 | nomic-embed-text | Lightweight, good for smaller machines |
| mistral | mxbai-embed-large | Balanced quality and speed |
| qwen2:7b | nomic-embed-text | Good multilingual support |
3. Configure Agent Vault
agents-vault configureSelect Ollama (Local) when prompted. Agent Vault will auto-discover your pulled models:
? Select LLM provider: Ollama (Local)
? Ollama base URL: http://localhost:11434
Discovering local models...
? Select answer model: gpt-oss:20b
? Select embedding model: embeddinggemma
✔ Configuration savedNo API keys are needed — Ollama runs entirely on your machine.
4. Ingest and ask
agents-vault ingest --source ./my-docs --project my-project
agents-vault ask "What does this project do?" --project my-projectTroubleshooting Ollama
- Make sure
ollama serveis running before usingingestorask. - If embedding fails mid-ingest, retry with
--reindex. Large batches can occasionally cause Ollama's internal subprocess to restart. - To verify Ollama is reachable:
curl http://localhost:11434/api/tags - If using a non-default port or remote Ollama instance, specify the base URL during
agents-vault configure.
Supported Inputs
Ingestion supports:
txtmdpdfpng,jpg,jpeg,webp(stub image/OCR flow in v1)
Project Structure
apps/cli # CLI entrypoint and command handlers
packages/core # domain entities, ports, application services
packages/ingestion # discovery/parsing/chunking pipeline
packages/retrieval # context reduction and answer prompt assembly
packages/storage # sqlite vector store + local config/auth/export repos
packages/providers # OpenAI/Azure/Ollama providers + OCR/vision stubs
packages/shared # schemas, errors, utilitiesDevelopment
Workspace scripts:
pnpm build
pnpm typecheck
pnpm test
pnpm lintCLI package only:
pnpm --filter @agents-vault/cli build
pnpm --filter @agents-vault/cli testTroubleshooting
- Run
agents-vault doctorfirst for environment/config/storage checks. - If commands fail globally, relink:
cd apps/cli
pnpm link --global- If provider calls fail, rerun
agents-vault configureand re-enter credentials.
Roadmap
- Replace image/OCR stubs with production adapters.
- Add broader fixture-based integration coverage.
Contributing
Contributions are welcome. Before opening a PR, run:
pnpm typecheck
pnpm test
pnpm lintLicense
MIT — see LICENSE.
