docs-hub-mcp
v1.0.14
MCP server for documentation hub — sync from Wiki, Slack, Google Docs, Notion, Confluence to .md, git versioned, with 3-tier search
doc-hub-mcp
MCP server for documentation hub — sync knowledge from multiple sources into version-controlled .md files with 3-tier search, designed for AI agents.
Overview
AI agents need fast, reliable access to team documentation. But docs are scattered across Wiki, Slack, Google Docs, Notion, and Confluence. doc-hub-mcp solves this by:
- Syncing documentation from multiple sources into .md files
- Indexing with an auto-generated knowledge graph
- Serving via MCP with 3-tier search (Exact → Graph → Semantic)
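The Exact → Graph → Semantic fallthrough can be sketched roughly as below. This is an illustration under stated assumptions, not the package's actual implementation: `TierSearch`, `RankedHit`, and `tieredSearch` are hypothetical names, and the stopping rule (fall through to the next tier until `minResults` documents are found) is an assumption. The tier weights (1.0, 0.8, 0.5) and the `minResults`/`maxResults` defaults are the values given in the Search Architecture and Configuration sections.

```typescript
// Hypothetical shapes for illustration only; not the package's real types.
interface RawHit { path: string; snippet: string }
interface RankedHit extends RawHit { tier: number; score: number }
type TierSearch = (query: string) => RawHit[];

// Weights from the Search Architecture section: Tier1=1.0, Tier2=0.8, Tier3=0.5.
const TIER_WEIGHTS = [1.0, 0.8, 0.5];

function tieredSearch(
  query: string,
  tiers: TierSearch[],   // ordered: exact, graph, semantic
  minResults = 3,        // assumed meaning: stop once this many documents are found
  maxResults = 10,
): RankedHit[] {
  const best = new Map<string, RankedHit>();
  for (let i = 0; i < tiers.length; i++) {
    for (const hit of tiers[i](query)) {
      // Deduplicate by path: the earliest (highest-weight) tier wins.
      if (!best.has(hit.path)) {
        best.set(hit.path, { ...hit, tier: i + 1, score: TIER_WEIGHTS[i] ?? 0 });
      }
    }
    // Only fall through to a slower tier when results are still insufficient.
    if (best.size >= minResults) break;
  }
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}
```

Under these assumptions the slower tiers run only when the faster ones do not return enough documents, which is what keeps the common case in the low-millisecond range.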
Market Differentiator
| | codegraph | agentmemory | coral | doc-hub-mcp |
|---|---|---|---|---|
| Target | Code | Agent memory | APIs | Documentation |
| Input | Source code | Agent calls | SQL | Wiki, Slack, Docs, Notion, Confluence |
| Format | SQLite | Internal | Tables | .md + git |
| Auto-sync | File watch | Hook | Manual | Scheduled / incremental |
| Search | Symbol graph | Semantic hybrid | SQL | 3-tier (~5ms → ~20ms → ~25ms) |
Features
- 5 sync adapters: Git Wiki, Google Docs, Slack, Notion, Confluence → .md
- 3-tier search: Exact (ripgrep, ~5ms) → Knowledge Graph (YAML, ~20ms) → AI Semantic (TF-IDF, ~25ms)
- Incremental sync: Only sync changes since last run, with per-source state tracking
- Git versioned: Auto-commit after each sync, full history and audit trail
- MCP server: 3 tools (search_knowledge, read_document, get_document_structure)
- Web UI: Browser dashboard for search, browse, and sync management
- LRU cache: Repeat queries return in <1ms
- Zero external dependencies: No vector database (Pinecone/Weaviate/Chroma), no GPU required
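To make the LRU cache feature concrete, here is a minimal sketch of a size-bounded cache with expiry. It is not the package's code: `LruCache` is a hypothetical class, and it assumes that the `cache.maxSize` and `cache.ttl` settings from the Configuration section mean "maximum number of entries" and "time to live in seconds".

```typescript
// Hypothetical LRU cache with TTL; assumes ttl is expressed in seconds.
class LruCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxSize: number;
  private ttlMs: number;
  private now: () => number;

  constructor(maxSize: number, ttlSeconds: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.ttlMs = ttlSeconds * 1000;
    this.now = now; // injectable clock, which makes expiry easy to test
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    this.entries.delete(key);
    if (entry.expiresAt <= this.now()) return undefined; // expired: drop it
    this.entries.set(key, entry); // re-insert to mark as most recently used
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // A Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Values mirror the sample config: { "maxSize": 100, "ttl": 3600 }.
const queryCache = new LruCache<string[]>(100, 3600);
queryCache.set("deploy", ["company-wiki/deploy.md"]); // hypothetical result path
```

A repeat query then costs one Map lookup instead of a search, which is consistent with the sub-millisecond figure quoted above.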
Quick Start
# Install globally
npm install -g doc-hub-mcp
# Initialize a new project
dhm init
# Edit config.json to add your sources, then sync
dhm sync
# Build the search index
dhm index
# Run a search
dhm search "how to deploy to production"
# Start the Web UI
dhm web
# Or start the MCP server for AI agents
dhm serve
CLI Reference
dhm init # Create config.json + knowledge-base/
dhm sync [--incremental] # Sync all sources (or incremental)
dhm index # Rebuild .dhm-index.yaml
dhm search "query" # Run 3-tier search
  --tier 1|2|3 # Force tier
  --max 10 # Max results
  --json # JSON output
dhm serve # Start MCP server
dhm web # Start Web UI (http://localhost:3456)
dhm prewarm # Pre-warm cache
Configuration
{
  "knowledgeBase": "./knowledge-base",
  "maxResults": 10,
  "minResults": 3,
  "cache": { "maxSize": 100, "ttl": 3600 },
  "prewarm": { "queries": ["deploy", "api", "config"] },
  "sources": [
    { "type": "git-wiki", "name": "company-wiki", "enabled": true, "url": "https://github.com/company/wiki.git", "branch": "main" },
    { "type": "google-docs", "name": "team-docs", "enabled": false, "credentialsFile": "./credentials.json", "folderId": "xxx" },
    { "type": "slack", "name": "engineering-slack", "enabled": false },
    { "type": "notion", "name": "product-docs", "enabled": false, "url": "your-database-id" },
    { "type": "confluence", "name": "company-confluence", "enabled": false, "url": "https://company.atlassian.net/wiki", "folderId": "SPACEKEY" }
  ]
}
MCP Integration
Add the server to your Claude Desktop configuration (claude_desktop_config.json) or to Cursor's mcp.json:
{
  "mcpServers": {
    "doc-hub-mcp": {
      "command": "npx",
      "args": ["doc-hub-mcp", "serve"],
      "cwd": "/path/to/your/doc-hub"
    }
  }
}
MCP Tools
| Tool | Description |
|---|---|
| search_knowledge | 3-tier search returning ranked results with snippets |
| read_document | Read full content of a document by relative path |
| get_document_structure | Browse directory tree, source list, and sync status |
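For orientation, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such requests for two of the tools in the table. The tool names come from the table; the argument names (`query`, `path`) and the example document path are assumptions for illustration, since this README does not document the tools' input schemas. `buildToolCall` is a hypothetical helper.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and its arguments in a request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Argument names below are assumed, not taken from the package's schema.
const searchRequest = buildToolCall(1, "search_knowledge", {
  query: "how to deploy to production",
});
const readRequest = buildToolCall(2, "read_document", {
  path: "company-wiki/deploy.md", // hypothetical relative path
});

// A client would serialize each request before sending it to the server.
const wireMessages = [searchRequest, readRequest].map((r) => JSON.stringify(r));
```

In practice an MCP client such as Claude Desktop or Cursor builds these messages for you; the real argument names can be discovered from the server's `tools/list` response.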
Environment Variables
| Variable | Adapter | Required |
|---|---|---|
| SLACK_BOT_TOKEN | Slack | Yes |
| NOTION_TOKEN | Notion | Yes |
| CONFLUENCE_URL | Confluence | Yes |
| CONFLUENCE_API_TOKEN | Confluence | Yes |
| GOOGLE_ACCESS_TOKEN | Google Docs | Yes |
| DHM_LOG_LEVEL | All | No (default: info) |
| DHM_PORT | Web UI | No (default: 3456) |
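A pre-flight check for these variables might look like the sketch below. It is a hypothetical helper, not part of the package: `missingEnvVars` and `REQUIRED_ENV` are illustrative names, the mapping is copied from the table above, and two things are assumed: that a variable is only needed when its adapter is enabled in config.json, and that the git-wiki adapter needs no variable (none is listed for it).

```typescript
type SourceType = "git-wiki" | "google-docs" | "slack" | "notion" | "confluence";
interface Source { type: SourceType; name: string; enabled: boolean }

// Required variables per adapter, as listed in the table above.
const REQUIRED_ENV: Record<SourceType, string[]> = {
  "git-wiki": [], // assumption: no variable is listed for this adapter
  "google-docs": ["GOOGLE_ACCESS_TOKEN"],
  slack: ["SLACK_BOT_TOKEN"],
  notion: ["NOTION_TOKEN"],
  confluence: ["CONFLUENCE_URL", "CONFLUENCE_API_TOKEN"],
};

// Returns the names of variables that enabled sources need but env lacks.
function missingEnvVars(
  sources: Source[],
  env: Record<string, string | undefined>,
): string[] {
  const missing: string[] = [];
  for (const source of sources) {
    if (!source.enabled) continue; // assumption: disabled sources need nothing
    for (const name of REQUIRED_ENV[source.type]) {
      if (!env[name] && missing.indexOf(name) === -1) missing.push(name);
    }
  }
  return missing;
}
```

In a Node.js script this could be called as `missingEnvVars(config.sources, process.env)` before running `dhm sync`, so that a missing token is reported up front rather than partway through a sync.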
Docker
docker compose up -d
Search Architecture
Agent query
│
├── Tier 1: Exact Match (ripgrep) ~5ms ← 80% of queries
│ ↓ miss
├── Tier 2: Knowledge Graph (.dhm-index) ~20ms ← 15% of queries
│ ↓ still insufficient
└── Tier 3: AI Semantic (TF-IDF) ~25ms ← 5% of queries
│
▼
Deduplicate + Rank (Tier1=1.0, Tier2=0.8, Tier3=0.5)
Project Structure
src/
├── cli/ # CLI commands + config loader (zod-validated)
├── mcp/ # MCP server + 3 tool implementations
├── search/ # 3-tier engine: exact, graph, semantic, cache, pipeline
├── index/ # YAML index builder + markdown parser
├── sync/ # 5 adapters: git-wiki, google-docs, slack, notion, confluence
├── web/ # Express server + browser UI
└── utils/ # FS, git, logger utilities
Documentation
- Quickstart Guide — Set up and first sync in 5 minutes
- Architecture Guide — Detailed design and data flow
- Contributing Guide — Conventions, testing, PR process
License
MIT
