obsidian-hybrid-search
v0.9.9
Published
An [MCP server](https://modelcontextprotocol.io) and CLI tool that makes your Obsidian vault queryable by AI assistants. Indexes notes into SQLite with FTS5 full-text search, trigram fuzzy matching, and `sqlite-vec` vector similarity — results are merged
Downloads
5,987
Readme
Obsidian Hybrid Search
An MCP server and CLI tool that makes your Obsidian vault queryable by AI assistants. Indexes notes into SQLite with FTS5 full-text search, trigram fuzzy matching, and sqlite-vec vector similarity — results are merged with Reciprocal Rank Fusion (RRF) and scored 0–1.
Once connected, any MCP-compatible AI assistant can answer questions grounded in your actual notes: finding knowledge by meaning, exact phrase, or title; traversing the wikilink graph; filtering by tag or folder; always citing the source note. No guessing from training data, no manual copy-paste.
No external services required. A bundled @huggingface/transformers model handles embeddings locally by default. Any OpenAI-compatible API (OpenRouter, Ollama, LM Studio) works as a drop-in replacement.
Features
- Hybrid search
- BM25 + fuzzy title + semantic embeddings, fused with RRF
- Alias search
- notes with
aliases:in frontmatter are indexed and searchable by any alias; alias matches are boosted in BM25 (weight 5×) and fuzzy title scoring
- notes with
- Four search modes
hybrid,semantic,fulltext,title(for text queries)
- Similar note lookup
- pass
--pathto find semantically related notes (always semantic, uses title + content)
- pass
- Graph traversal
--path --relatedshows linked notes at configurable depth; filter by--direction outgoing|backlinks|both
- Links & backlinks
- every result includes outgoing links and backlinks
- Scope filtering
- restrict to subfolder(s); supports multiple values and exclusions (
-notes/dev/)
- restrict to subfolder(s); supports multiple values and exclusions (
- Tag filtering
- filter by tag(s); supports multiple values and exclusions (
-category/cs)
- filter by tag(s); supports multiple values and exclusions (
- Snippet control
--snippet-lengthsets the context window; empty snippets always fall back to note content
- Extended output
--extendedadds a TAGS/ALIASES column to the CLI table showing frontmatter tags (#tag) and aliases
- Incremental indexing
- only re-indexes changed files; watches for edits in real time
- Multi-query fan-out
- pass multiple queries at once (
ohs search "q1" "q2"orqueries[]in MCP); results are merged via RRF — a note that ranks well in any one query floats to the top; useful when the note may use different vocabulary than the query
- pass multiple queries at once (
- Cross-encoder reranking
--rerankre-scores results withbge-reranker-v2-m3(ONNX int8, ~570 MB download once); improves precision for conceptual and multilingual queries; applied after multi-query merge
- Local embeddings
- works offline via
@huggingface/transformers(no API key required); default model: Xenova/multilingual-e5-small, 100+ languages
- works offline via
- Remote embeddings
- OpenAI-compatible API (OpenRouter, Ollama, etc.)
- Note reading
readfetches one or more notes by vault-relative path; returns full content with title, aliases, tags, links, and backlinks; on path miss returns top-3 fuzzy suggestions
- Ignore patterns
- exclude folders, extensions, or specific files
Installation
npm install -g obsidian-hybrid-search
# or run directly without installing:
npx obsidian-hybrid-searchCLI usage
The tool auto-discovers the database by walking up from the current directory looking for .obsidian-hybrid-search.db. The simplest way to use it — run from inside your vault:
cd /path/to/your/vault
obsidian-hybrid-search "zettelkasten"Two ways to search
| Scenario | How | Modes |
| --------------- | ------------------------------------------------------------------- | --------------------------------------------------- |
| Text query | obsidian-hybrid-search "some topic" | hybrid (default), semantic, fulltext, title |
| Similar notes | obsidian-hybrid-search --path notes/pkm/zettelkasten.md | Always semantic (title + content) |
| Graph traversal | obsidian-hybrid-search --path notes/pkm/zettelkasten.md --related | Links & backlinks via BFS |
--mode only affects text queries. When --path is given, the search is always semantic regardless of --mode.
If you want to run it from anywhere (e.g. via shell aliases), set the environment variables explicitly. Add to your ~/.zshrc or ~/.bashrc:
export OBSIDIAN_VAULT_PATH="/path/to/your/vault"
# Optional — only needed if using a remote embedding API instead of local model
export OPENAI_API_KEY="sk-..."
# Default API base is https://api.openai.com/v1 — override for other providers:
# export OPENAI_BASE_URL="https://openrouter.ai/api/v1" # OpenRouter
# export OPENAI_BASE_URL="http://localhost:11434/v1" # Ollama (no key needed)
# export OPENAI_BASE_URL="http://localhost:1234/v1" # LM Studio (no key needed)
export OPENAI_EMBEDDING_MODEL="text-embedding-3-small"Without OPENAI_API_KEY the local Xenova/multilingual-e5-small model is used automatically — works offline, no API key needed. Downloads ~117 MB on first run. Supports 100+ languages including Russian, Chinese, Japanese, and more.
# Hybrid search (default)
obsidian-hybrid-search "zettelkasten atomic notes"
# Fulltext BM25 search
obsidian-hybrid-search "permanent notes" --mode fulltext
# Fuzzy title search (fast, typo-tolerant)
obsidian-hybrid-search "zettleksten" --mode title
# Semantic / vector search
obsidian-hybrid-search "how to build a knowledge graph" --mode semantic
# Limit results and set a score threshold
obsidian-hybrid-search "productivity systems" --limit 5 --threshold 0.3
# Restrict to a subfolder
obsidian-hybrid-search "daily review" --scope notes/periodic/
# Restrict to multiple subfolders (OR)
obsidian-hybrid-search "productivity" --scope notes/pkm/ --scope notes/projects/
# Exclude a subfolder
obsidian-hybrid-search "programming" --scope notes/ --scope -notes/archive/
# Filter by tag
obsidian-hybrid-search "productivity" --tag pkm
obsidian-hybrid-search "machine learning" --tag note/basic/primary
# Filter by multiple tags (OR include, exclude with -)
obsidian-hybrid-search "learning" --tag pkm --tag -draft
# Find semantically similar notes
obsidian-hybrid-search --path notes/pkm/zettelkasten.md
# Graph traversal: show notes linked to/from this note
# Results show depth: -1/-2 = backlinks, 0 = source, +1/+2 = outgoing links
obsidian-hybrid-search --path notes/pkm/zettelkasten.md --related
obsidian-hybrid-search --path notes/pkm/zettelkasten.md --related --depth 2
# Only outgoing links (what this note references)
obsidian-hybrid-search --path notes/pkm/zettelkasten.md --related --direction outgoing
# Only backlinks (who references this note)
obsidian-hybrid-search --path notes/pkm/zettelkasten.md --related --direction backlinks
# Longer context around each link
obsidian-hybrid-search --path notes/pkm/zettelkasten.md --related --snippet-length 500
# Rerank results with a cross-encoder model (improves precision, ~1-3s extra latency)
# Downloads bge-reranker-v2-m3 ONNX (~570 MB) on first use, cached in ~/.cache/huggingface/
obsidian-hybrid-search "zettelkasten atomic notes" --rerank
# Show tags and aliases alongside results
obsidian-hybrid-search "zettelkasten" --extended
# JSON output (for scripting)
obsidian-hybrid-search "spaced repetition" --json
# Open results in Obsidian (each in a new tab)
obsidian-hybrid-search "zettelkasten" --open
# Reindex the vault
obsidian-hybrid-search reindex
# Force full reindex
obsidian-hybrid-search reindex --force
# Reindex a single file
obsidian-hybrid-search reindex notes/pkm/zettelkasten.md
# Show indexing status
obsidian-hybrid-search status
# Read a note by path (outputs raw content, like cat)
obsidian-hybrid-search read notes/pkm/zettelkasten.md
# Read multiple notes (separator between each)
obsidian-hybrid-search read notes/pkm/zettelkasten.md notes/pkm/evergreen-notes.md
# Cap content length
obsidian-hybrid-search read notes/pkm/zettelkasten.md --snippet-length 2000
# Structured output with all metadata
obsidian-hybrid-search read notes/pkm/zettelkasten.md --jsonShell aliases
Add to your ~/.zshrc or ~/.bashrc for quick access:
alias ohs='obsidian-hybrid-search'
alias ohss='obsidian-hybrid-search --mode semantic'
alias ohst='obsidian-hybrid-search --mode title'
alias ohsf='obsidian-hybrid-search --mode fulltext'
alias ohsi='obsidian-hybrid-search reindex'
alias ohsst='obsidian-hybrid-search status'Then reload (source ~/.zshrc) and use:
ohs "zettelkasten" # hybrid search
ohss "how to build a knowledge graph" # semantic
ohst "zettelkasten" # fuzzy title (typo-tolerant)
ohsf "permanent notes" # fulltext BM25
ohsi # reindex vault
ohsst # show statusOutput example
Hybrid search returns a table with scores and snippets:
┌───────┬───────────────────────────────┬────────────────────────────────────────────┐
│ SCORE │ PATH │ SNIPPET │
├───────┼───────────────────────────────┼────────────────────────────────────────────┤
│ 0.98 │ notes/pkm/zettelkasten.md │ A note-taking method developed by Niklas │
│ │ │ Luhmann. Each note contains one atomic... │
├───────┼───────────────────────────────┼────────────────────────────────────────────┤
│ 0.72 │ notes/pkm/evergreen-notes.md │ Evergreen notes are written to evolve over │
│ │ │ time. Unlike fleeting notes, they are... │
└───────┴───────────────────────────────┴────────────────────────────────────────────┘With --extended, a TAGS/ALIASES column is added. Tags are prefixed with #, aliases are shown as-is:
┌───────┬───────────────────────────────┬──────────────────┬──────────────────────────────┐
│ SCORE │ PATH │ TAGS/ALIASES │ SNIPPET │
├───────┼───────────────────────────────┼──────────────────┼──────────────────────────────┤
│ 0.98 │ notes/pkm/zettelkasten.md │ #pkm │ A note-taking method... │
│ │ │ ЗК │ │
│ │ │ slip-box │ │
├───────┼───────────────────────────────┼──────────────────┼──────────────────────────────┤
│ 0.72 │ notes/pkm/evergreen-notes.md │ #pkm │ Evergreen notes are written │
│ │ │ #writing │ to evolve over time... │
└───────┴───────────────────────────────┴──────────────────┴──────────────────────────────┘Title mode omits the snippet column automatically.
MCP server
Most AI assistants operate without access to your personal knowledge — they can only work with what you paste into the conversation. Adding this server gives any MCP-compatible assistant a persistent, searchable index of your entire vault. It becomes a tool call, not a copy-paste session: the assistant queries your notes the same way it calls any other tool, gets ranked results with snippets and links, and can navigate your knowledge graph on request.
Add to your MCP config (.mcp.json, claude_desktop_config.json, or equivalent for your client).
Minimal config (local embeddings, no API key)
Uses the built-in Xenova/multilingual-e5-small model — works fully offline, supports 100+ languages. Downloads ~117 MB on first run.
{
"mcpServers": {
"obsidian-hybrid-search": {
"command": "npx",
"args": ["-y", "-p", "obsidian-hybrid-search@latest", "obsidian-hybrid-search-mcp"],
"env": {
"OBSIDIAN_VAULT_PATH": "/path/to/your/vault"
}
}
}
}Full config (OpenRouter)
{
"mcpServers": {
"obsidian-hybrid-search": {
"command": "npx",
"args": ["-y", "-p", "obsidian-hybrid-search@latest", "obsidian-hybrid-search-mcp"],
"env": {
"OBSIDIAN_VAULT_PATH": "/path/to/your/vault",
"OBSIDIAN_IGNORE_PATTERNS": ".obsidian/**,templates/**,*.canvas",
"OPENAI_API_KEY": "sk-or-v1-...",
"OPENAI_BASE_URL": "https://openrouter.ai/api/v1",
"OPENAI_EMBEDDING_MODEL": "openai/text-embedding-3-small"
}
}
}
}Note: On first run,
npxwill install the package automatically. Ignore patterns are persisted in the database and restored on every subsequent startup even if the env var is missing.
The server exposes four tools:
| Tool | Description |
| --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| search | Search the vault. Use query for text search (mode: hybrid/semantic/fulltext/title) or path for semantic similarity. Combine path with related: true for graph traversal. Pass queries[] for multi-query fan-out (parallel search, RRF merge). Supports scope, tag, limit, threshold, depth, direction, snippet_length, rerank |
| read | Fetch one or more notes by vault-relative path. Returns full content, title, aliases, tags, links, and backlinks. On path miss: returns found: false with top-3 fuzzy suggestions. Accepts a single path or an array. Use snippet_length to cap content size |
| reindex | Reindex the vault or a specific file |
| status | Show total notes, indexed count, last indexed time |
Configuration
| Environment variable | Default | Description |
| -------------------------- | ------------------------------------ | ---------------------------------------------------------------------------------- |
| OBSIDIAN_VAULT_PATH | (required) | Absolute path to your vault |
| OBSIDIAN_IGNORE_PATTERNS | .obsidian/**,templates/**,*.canvas | Comma-separated ignore patterns |
| OPENAI_API_KEY | — | API key; omit to use local model embeddings or keyless servers (Ollama, LM Studio) |
| OPENAI_BASE_URL | https://api.openai.com/v1 | API base URL |
| OPENAI_EMBEDDING_MODEL | text-embedding-3-small | Embedding model name |
Ignore patterns
folder/**— ignore a directory and all its contents*.canvas— ignore by extensionexact/path.md— ignore a specific file
The ignore configuration is persisted in the database, so it is restored automatically even if the environment variable is missing on restart.
How it works
- Indexing — notes are chunked by headings (with sliding-window fallback), embedded, and stored in SQLite with FTS5 and
sqlite-vec. - Search — BM25 (with column weights: title 10×, aliases 5×, content 1×), fuzzy trigram title/alias search, and vector KNN search run in parallel; results are fused with RRF and scored 0–1 (higher = more relevant).
- Links — wikilinks (
[[note]]) are resolved to note paths and stored; every search result includeslinksandbacklinksarrays. - Watcher —
chokidarwatches for file changes and incrementally re-indexes in the background.
Development
npm install
npm test # run test suite
npm run build # compile TypeScriptTests use fake embeddings (no API key required) and run against a temporary vault. All tests cover chunking, BM25 scoring, fuzzy search, links/backlinks, tag filtering, scope filtering, related-mode traversal, direction/score logic, snippet fallback, and ignore pattern matching.
License
MIT
