@patate/hermes-docs-mcp
v0.2.0
MCP server to search and read NousResearch's Hermes Agent documentation
Hermes Docs MCP Server
MCP server that provides semantic search and document retrieval for NousResearch Hermes Agent documentation.
Uses a local embedding model (nomic-embed-text-v1.5, Q6_K quantized, ~108 MB) loaded via node-llama-cpp — no external API keys or network calls needed at query time.
What it does
Two MCP tools are exposed to connected AI agents:
| Tool | Description |
|---|---|
| search_docs | Semantic search across all docs. Returns ranked chunks with similarity scores, file paths, and content. |
| get_document | Retrieves the full content of a specific doc file by path (e.g. user-guide/security.md). |
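For orientation, an MCP client invokes tools through a JSON-RPC 2.0 `tools/call` request. The sketch below builds such requests for the two tools above. The argument names `query` and `path` are assumptions made for illustration, since the input schemas are not listed here; a connected client would normally discover them from the server's `tools/list` response.

```typescript
// Shape of a JSON-RPC 2.0 request for MCP's `tools/call` method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build a request for one of the two tools this server exposes.
function buildToolCall(
  name: "search_docs" | "get_document",
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// ASSUMPTION: `query` and `path` are hypothetical argument names.
const searchRequest = buildToolCall("search_docs", { query: "how to deploy" });
const readRequest = buildToolCall("get_document", { path: "user-guide/security.md" });

console.log(JSON.stringify(searchRequest, null, 2));
console.log(JSON.stringify(readRequest, null, 2));
```

In practice an MCP client library builds and sends these messages for you; the sketch only shows what travels over the wire.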
Install
Via npm (recommended)
npm install -g @patate/hermes-docs-mcp
Or run directly with npx:
npx @patate/hermes-docs-mcp
From source
git clone <repo> && cd hermes-doc
pnpm install
pnpm run setup # downloads model + syncs docs + builds DB
The setup step does three things:
- Downloads the embedding model from Hugging Face (~108 MB, cached in models/)
- Syncs docs from the Hermes Agent GitHub repo (~27 MB tarball, written to docs/)
- Builds the database: chunks every .md file and generates embeddings (~15 min on an M-series Mac)
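Once every chunk has an embedding, a search amounts to embedding the query and ranking the stored vectors against it. The following is a minimal sketch of that ranking step, assuming cosine similarity as the score (the README only says "similarity scores") and using tiny hand-written vectors in place of real model output. The `Chunk` shape and function names are hypothetical.

```typescript
// ASSUMPTION: hypothetical record shape for a stored chunk.
interface Chunk {
  path: string;
  content: string;
  embedding: number[];
}

interface ScoredChunk {
  path: string;
  content: string;
  score: number;
}

// Dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Cosine similarity: 1 for identical directions, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  const norm = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return norm === 0 ? 0 : dot(a, b) / norm;
}

// Score every chunk against the query and keep the topK best matches.
function rankChunks(query: number[], chunks: Chunk[], topK: number): ScoredChunk[] {
  return chunks
    .map((c) => ({ path: c.path, content: c.content, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy 2-dimensional vectors; real embeddings have far more dimensions.
const chunks: Chunk[] = [
  { path: "a.md", content: "deploying", embedding: [1, 0] },
  { path: "b.md", content: "unrelated", embedding: [0, 1] },
  { path: "c.md", content: "partly related", embedding: [1, 1] },
];

const ranked = rankChunks([1, 0], chunks, 2);
console.log(ranked.map((r) => r.path)); // a.md first, then c.md
```

A linear scan like this is simple to reason about; whether the actual database uses a scan or an index is not stated here.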
MCP installation
Add the server to your MCP client config (e.g. ~/.config/<client>/mcp.json or your project's .mcp.json):
From npm
{
  "mcpServers": {
    "hermes-docs": {
      "command": "npx",
      "args": ["-y", "@patate/hermes-docs-mcp"]
    }
  }
}
From source
{
  "mcpServers": {
    "hermes-docs": {
      "command": "pnpm",
      "args": ["mcp"],
      "cwd": "<path-to-hermes-doc>"
    }
  }
}
Replace <path-to-hermes-doc> with the absolute path to this repo.
That's it — the MCP server auto-boots on first connection: if the model, docs, or database are missing, it downloads and builds them automatically.
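The auto-boot behavior can be pictured as a check for each artifact followed by whichever setup steps are still needed. This is a rough sketch of that decision under stated assumptions, not the server's actual code: the step names, the `ArtifactPaths` shape, and the file names are hypothetical (only the `models/` and `docs/` directories come from the setup notes). The existence check is passed in as a function so the logic stays easy to test.

```typescript
type SetupStep = "download-model" | "sync-docs" | "build-db";

// ASSUMPTION: hypothetical locations of the three artifacts.
interface ArtifactPaths {
  model: string;
  docs: string;
  database: string;
}

// Return the setup steps still required, in dependency order:
// the database is built last because it needs both the model and the docs.
function planBootstrap(paths: ArtifactPaths, exists: (path: string) => boolean): SetupStep[] {
  const steps: SetupStep[] = [];
  if (!exists(paths.model)) steps.push("download-model");
  if (!exists(paths.docs)) steps.push("sync-docs");
  if (!exists(paths.database)) steps.push("build-db");
  return steps;
}

// Simulated filesystem in which only the model has been downloaded.
const present = new Set(["models/embedding.gguf"]);
const paths: ArtifactPaths = {
  model: "models/embedding.gguf", // hypothetical file name
  docs: "docs",
  database: "data/docs.db", // hypothetical file name
};

const steps = planBootstrap(paths, (p) => present.has(p));
console.log(steps); // ["sync-docs", "build-db"]
```

In a real Node.js process, `existsSync` from `node:fs` could be passed as the `exists` argument.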
Environment variables
| Variable | Values | Default | Description |
|---|---|---|---|
| HERMES_DOCS_MODE | cpu, gpu, auto | auto | Controls GPU offloading for embeddings. cpu forces CPU-only (gpuLayers=0), gpu pushes all layers to GPU (gpuLayers=max), auto lets the runtime decide. |
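The mapping in the table can be sketched as a small pure function. This is an illustration under assumptions rather than the server's implementation: the function names are hypothetical, the `"max"` and `"auto"` return values are assumed to match what node-llama-cpp accepts for its `gpuLayers` model option, and falling back to `auto` for unrecognized values is a guess at reasonable behavior.

```typescript
type DocsMode = "cpu" | "gpu" | "auto";
type GpuLayers = number | "max" | "auto";

// Normalize the raw environment value; unset means the documented default.
function parseMode(raw: string | undefined): DocsMode {
  const value = (raw ?? "auto").trim().toLowerCase();
  if (value === "cpu" || value === "gpu" || value === "auto") return value;
  // ASSUMPTION: unrecognized values fall back to the default instead of failing.
  return "auto";
}

// Translate the mode into a GPU offloading setting, following the table above.
function gpuLayersFor(mode: DocsMode): GpuLayers {
  switch (mode) {
    case "cpu":
      return 0; // CPU-only: offload no layers
    case "gpu":
      return "max"; // offload every layer
    case "auto":
      return "auto"; // let the runtime decide
  }
}

console.log(gpuLayersFor(parseMode("cpu"))); // 0
console.log(gpuLayersFor(parseMode(undefined))); // "auto"
```

In a running server the input would come from `process.env.HERMES_DOCS_MODE`.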
For example, to force CPU-only mode when running from source:
{
  "mcpServers": {
    "hermes-docs": {
      "command": "pnpm",
      "args": ["mcp"],
      "cwd": "<path-to-hermes-doc>",
      "env": {
        "HERMES_DOCS_MODE": "cpu"
      }
    }
  }
}
CLI tools
Besides the MCP server, standalone CLI tools are available (from source):
| Command | Description |
|---|---|
| pnpm run setup | Full setup: download model, sync docs, build DB |
| pnpm query "how to deploy" | One-shot semantic search from terminal |
| pnpm run sync-docs | Refresh docs from GitHub |
| pnpm run build:db | Rebuild embeddings (e.g. after doc refresh) |
Publishing
npm login
npm publish --access public
Requirements
- Node.js 20+
- Xcode Command Line Tools (for native dependencies: better-sqlite3, node-llama-cpp)
