openclaw-kb
v1.2.0
Published
Self-updating vector knowledge base for OpenClaw — semantic search over docs, code, and configs with automatic release tracking
Downloads
194
Maintainers
Readme
OpenClaw Knowledge Base
Self-updating vector knowledge base for OpenClaw — semantic search over docs, code, and configs with automatic release tracking.
Install
Via mise (recommended)
# mise.toml
[tools]
"npm:openclaw-kb" = "latest"
[env]
UPSTREAM_DIR = "{{ config_root }}/build/openclaw/.upstream"
KB_DATA_DIR = "{{ config_root }}/data/kb"For development
git clone https://github.com/mrkhachaturov/openclaw-kb.git
cd openclaw-kb
just setup # npm installUsage
# Search documentation
openclaw-kb docs "sandbox configuration"
# Search source code
openclaw-kb code "TtsProviderSchema"
# Cross-reference docs + code
openclaw-kb verify "session persistence"
# Full query with filters
openclaw-kb query "webhook" --ios --json --top 5
# Offline keyword search (no API key needed)
openclaw-kb query "telegram" --offline
# Sync to latest upstream release
openclaw-kb sync --upstream-dir /path --data-dir /path
# Database info
openclaw-kb stats
openclaw-kb latest
openclaw-kb history
openclaw-kb since v2026.2.21
# Install auto-sync timer
openclaw-kb install-service --interval 2h \
--upstream-dir /path/.upstream \
--data-dir /path/data \
--env-file /path/.envCLI Reference
| Command | Description |
|---------|-------------|
| query <text> | Hybrid search (vector + keyword + RRF fusion) |
| docs <text> | Search documentation only |
| code <text> | Search source code only |
| skills <text> | Search skills only |
| verify <text> | Two-pass: docs then related code |
| index | Reindex the knowledge base |
| sync | Fetch latest upstream tag and reindex |
| stats | Show database statistics |
| latest | Show current indexed version |
| history | Show last 10 releases |
| since <version> | What changed since a version |
| install-service | Generate systemd/launchd auto-sync |
Query flags
| Flag | Effect |
|------|--------|
| --docs | Filter to documentation |
| --code | Filter to source code |
| --skills | Filter to skills |
| --ios | Filter to iOS Swift sources |
| --macos | Filter to macOS Swift sources |
| --shared | Filter to shared Swift package |
| --releases | Filter to release notes |
| --verify | Two-pass: docs then code |
| --json | Machine-readable JSON output |
| --top <n> | Number of results (default: 8) |
| --offline | FTS-only keyword search, no API key needed |
Exit codes
| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Runtime error |
| 2 | Configuration error |
| 3 | No results found |
Environment Variables
| Variable | Purpose | Default |
|----------|---------|---------|
| UPSTREAM_DIR | OpenClaw source checkout | ./source |
| KB_DATA_DIR | SQLite DB + logs directory | ./data |
| OPENAI_API_KEY | For embedding generation | required (openai provider) |
| KB_EMBEDDING_MODEL | OpenAI embedding model | text-embedding-3-small |
| KB_EMBEDDING_PROVIDER | openai or local | openai |
| KB_LOCAL_MODEL | ONNX model for local provider | all-MiniLM-L6-v2 |
| KB_LOG_DIR | Override log directory | $KB_DATA_DIR/log |
Precedence: CLI flags → env vars → defaults.
Local Embedding (no API key)
# Install optional dependency
npm install @huggingface/transformers
# Index and query fully offline
KB_EMBEDDING_PROVIDER=local openclaw-kb index --force
KB_EMBEDDING_PROVIDER=local openclaw-kb query "sandbox" --docsA DB built with local embeddings is not compatible with openai embeddings (different dimensions). Switching providers requires a full reindex.
Multi-Machine Setup
Run sync on one machine, distribute the DB file via Syncthing/rsync/S3:
Server (cron): openclaw-kb sync → updates upstream.db
│
│ file sync
▼
Client (read-only): openclaw-kb query --offlineMetadata commands (stats, latest, history, since) work without API key. Use --offline for queries without an API key.
How It Works
- Indexing: Discovers files via glob patterns → chunks by semantic boundaries → embeds via OpenAI or local ONNX → stores in SQLite with sqlite-vec
- Querying: Expands query with synonyms → embeds query → hybrid search (vector similarity + BM25 keyword) → RRF fusion ranking
- Syncing: Fetches upstream git tags → diffs KB-relevant files → checks out new tag → reindexes changed files
Performance
- Index time: ~10-15 min (~7700 chunks from 2500 files)
- Query time: <100ms
- Storage: ~80MB SQLite
- Cost: ~$0.15/reindex (OpenAI) or free (local)
MCP Server
Expose the KB as an MCP server for native AI tool integration (Claude Desktop, Cursor, Claude Code):
{
"mcpServers": {
"openclaw-kb": {
"command": "openclaw-kb",
"args": ["mcp-serve"],
"env": {
"KB_DATA_DIR": "/path/to/data",
"UPSTREAM_DIR": "/path/to/upstream"
}
}
}
}Available tools: search, search_docs, search_code, search_skills, search_ios, get_stats, get_latest, get_history, get_since
AI Agent Integration
See AGENTS.md for integration with Claude Code, Cursor, and custom agents.
License
MIT
