@orellbuehler/paperless-mcp
v1.1.0
Model Context Protocol server for Paperless-ngx: documents, organization, saved views, users, workflows, and optional semantic search.
paperless-mcp
MCP server for Paperless-ngx that exposes the REST API as tools for AI agents. Includes optional semantic search via local vector embeddings.
Install
The package is published as @orellbuehler/paperless-mcp and runs directly with npx — no clone or build needed:
claude mcp add paperless \
  --env PAPERLESS_URL=https://your-paperless-instance.example.com \
  --env PAPERLESS_TOKEN=your-api-token \
  -- npx -y @orellbuehler/paperless-mcp
See Usage with Claude Code for the equivalent JSON config.
Semantic search is off by default. To enable it, set EMBEDDINGS_ENABLED=true; the better-sqlite3 and sqlite-vec native modules are installed automatically as optional dependencies (this requires a build toolchain on your platform). The core document tools work without them.
Setup (from source)
For local development, clone the repo and build the dist/ output:
npm install
npm run build
Usage with Claude Code
Get your API token from Paperless-ngx (Settings > Administration, or POST /api/token/).
Set the environment variables by editing ~/.claude/settings.json. Using the published package:
{
  "mcpServers": {
    "paperless": {
      "command": "npx",
      "args": ["-y", "@orellbuehler/paperless-mcp"],
      "env": {
        "PAPERLESS_URL": "https://your-paperless-instance.example.com",
        "PAPERLESS_TOKEN": "your-api-token"
      }
    }
  }
}
If you built from source instead, use "command": "node" with "args": ["/path/to/paperless-mcp/dist/index.js"]. To enable semantic search, add "EMBEDDINGS_ENABLED": "true" (and "OPENAI_API_KEY" if using the OpenAI embedding provider).
Restart Claude Code. The tools will be available immediately.
If you enabled semantic search, run sync_embeddings to index your documents.
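If you prefer to obtain the token through the API rather than the web UI, the POST /api/token/ request mentioned above can be sketched as follows. This is a minimal illustration, not part of this package: `buildTokenRequest` and `TokenRequest` are hypothetical names, and the sketch assumes the endpoint accepts a JSON body with `username` and `password` and answers with `{ "token": "..." }`.

```typescript
// Sketch only: describes the HTTP request for Paperless-ngx's POST /api/token/.
// `TokenRequest` and `buildTokenRequest` are hypothetical names for illustration.
interface TokenRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildTokenRequest(
  baseUrl: string,
  username: string,
  password: string
): TokenRequest {
  // Strip trailing slashes so the path is joined with exactly one slash.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/api/token/`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ username, password }),
  };
}

// Usage (assumes a runtime with a global fetch, such as Node 18+):
//   const req = buildTokenRequest("https://your-paperless-instance.example.com", "alice", "secret");
//   const res = await fetch(req.url, req);
//   const { token } = await res.json(); // assumed response shape: { "token": "..." }
```

The returned token is the value to place in PAPERLESS_TOKEN.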
Updating
For a source install, the server runs from the compiled dist/ output, so updating only takes a rebuild and a restart. There is no need to re-run claude mcp add, because the launch command and path don't change:
git pull # if you track a remote
npm install # only if dependencies changed
npm run build # recompile src/ -> dist/
Then restart Claude Code (or your MCP client) so it re-spawns the server with the new build. Verify with claude mcp list (should show paperless ✓ connected) or run /mcp inside a session.
To change connection settings (URL, token, embedding provider), edit the env block in your config, or re-register the server:
claude mcp remove paperless
claude mcp add paperless --scope user \
  --env PAPERLESS_URL=http://localhost:8000 \
  --env PAPERLESS_TOKEN=your-api-token \
  -- node /path/to/paperless-mcp/dist/index.js
Regenerating the API spec
paperless-openapi.yaml is the Paperless-ngx OpenAPI schema used as a reference when building tools. Pull a fresh copy straight from a running instance (no Docker needed):
PAPERLESS_URL=https://your-paperless-instance.example.com \
PAPERLESS_TOKEN=your-api-token \
npm run spec:update
This fetches GET /api/schema/ and overwrites paperless-openapi.yaml. Run it whenever you upgrade Paperless-ngx.
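For orientation, the request behind this step can be sketched as below. This is not the project's actual spec:update script. `buildSchemaRequest` and `SchemaRequest` are hypothetical names, and the sketch assumes the token is sent with Paperless-ngx's `Authorization: Token <token>` header scheme.

```typescript
// Sketch only: describes a GET /api/schema/ request against a Paperless-ngx instance.
// `SchemaRequest` and `buildSchemaRequest` are hypothetical, not part of this package.
interface SchemaRequest {
  url: string;
  method: "GET";
  headers: Record<string, string>;
}

function buildSchemaRequest(baseUrl: string, token: string): SchemaRequest {
  // Strip trailing slashes so the path is joined with exactly one slash.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/api/schema/`,
    method: "GET",
    // Assumption: direct Paperless-ngx API calls authenticate with "Token <token>".
    headers: { Authorization: `Token ${token}` },
  };
}

// Usage (assumes a global fetch; writing the response body to
// paperless-openapi.yaml is left to the caller):
//   const req = buildSchemaRequest("https://your-paperless-instance.example.com", "your-api-token");
//   const schemaText = await (await fetch(req.url, req)).text();
```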
Available Tools
Core API Tools
| Category | Tools |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| Search | search_documents, search_autocomplete |
| Documents | list_documents, get_document, get_documents, download_document, update_document, delete_document, upload_document |
| Document details | get_document_metadata, get_document_suggestions, get_document_notes, add_document_note, delete_document_note |
| Bulk operations | bulk_edit_documents, bulk_set_object_permissions, get_next_asn |
| Correspondents | list_correspondents, get_correspondent, create_correspondent, update_correspondent, delete_correspondent |
| Document types | list_document_types, get_document_type, create_document_type, update_document_type, delete_document_type |
| Tags | list_tags, get_tag, create_tag, update_tag, delete_tag |
| Saved views | list_saved_views, get_saved_view, create_saved_view, update_saved_view |
| Storage paths | list_storage_paths, get_storage_path, create_storage_path, update_storage_path |
| Custom fields | list_custom_fields, get_custom_field, create_custom_field, update_custom_field |
| Users | list_users, get_user, create_user, update_user |
| Groups | list_groups, get_group, create_group, update_group |
| Paperless workflows | list_workflows, get_workflow, create_workflow, update_workflow |
| System | get_status, get_statistics, list_tasks |
Note: list_documents and search_documents return document metadata only (no OCR text) to keep responses small. Use get_document (single) or get_documents (batch) to retrieve full content.
Saved views, users/groups, and workflows support read + create + update only; there are no delete tools (use the Paperless web UI to delete). User management covers accounts and group membership; it does not set per-document permissions. Notes support add and delete only, so there is no note-editing tool.
The update_* tools for tags, correspondents, document types, storage paths, saved views, and custom fields accept owner and set_permissions ({ view, change } → { users, groups }) to share objects. bulk_set_object_permissions sets owner/permissions on many tags, correspondents, document types, or storage paths in one call (saved views and custom fields are not supported by the bulk endpoint; share those individually).
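To make the set_permissions shape concrete, here is a small sketch of the argument object an agent might pass to one of the update_* tools. The type names, the `buildShareArgs` helper, the `id` field, and the use of numeric user and group IDs are assumptions for illustration; the real tool schemas may differ.

```typescript
// Hypothetical types mirroring the shape described above:
// set_permissions = { view: { users, groups }, change: { users, groups } }.
interface PrincipalIds {
  users: number[];  // assumed: Paperless user IDs
  groups: number[]; // assumed: Paperless group IDs
}

interface SetPermissions {
  view: PrincipalIds;
  change: PrincipalIds;
}

interface ShareArgs {
  id: number;           // assumed name for the ID of the object being updated
  owner: number | null; // assumed: null means no owner
  set_permissions: SetPermissions;
}

function buildShareArgs(
  id: number,
  owner: number | null,
  view: Partial<PrincipalIds>,
  change: Partial<PrincipalIds>
): ShareArgs {
  // Fill in empty lists so both keys are always present in the output.
  const fill = (p: Partial<PrincipalIds>): PrincipalIds => ({
    users: p.users ?? [],
    groups: p.groups ?? [],
  });
  return { id, owner, set_permissions: { view: fill(view), change: fill(change) } };
}

// Example: object 12 is owned by user 1, viewable by group 3, editable by user 7.
const shareArgs = buildShareArgs(12, 1, { groups: [3] }, { users: [7] });
```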
Extended Tools
| Category | Tools | Description |
| --------------- | ---------------------------------------------------------------------- | -------------------------------------------------------- |
| Semantic search | semantic_search, sync_embeddings, embedding_status | Vector similarity search using local sqlite-vec database |
| Content | get_document_content | Extract OCR'd text content from documents |
| Workflows | auto_classify_document, process_inbox, bulk_tag_by_content | AI-assisted classification and bulk operations |
| Helpers | get_documents_by_correspondent, monthly_summary, upload_from_url | Convenience tools for common workflows |
Environment Variables
| Variable | Required | Description |
| ---------------------- | --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| PAPERLESS_URL | Yes | Base URL of your Paperless-ngx instance |
| PAPERLESS_TOKEN | Yes | API token. In stdio mode this is the user's token; in http mode it is the admin/indexer token (builds the shared embedding index and gates sync_embeddings) |
| MCP_TRANSPORT | No | stdio (default) or http |
| PORT | No | Port for the HTTP server (default: 3001, http mode only) |
| EMBEDDINGS_ENABLED | No | Set to true to enable semantic search tools (default: off) |
| MCP_ALLOWED_ORIGINS | No | Comma-separated Origin allowlist for browser clients (http mode). Empty (default) blocks all cross-origin browser requests; use * to allow any |
| MCP_ALLOWED_HOSTS | No | Comma-separated Host allowlist for DNS-rebinding protection (http mode). Empty (default) disables host validation |
| EMBEDDING_PROVIDER | No | openai or ollama (default: openai) |
| OPENAI_API_KEY | If using OpenAI | Required for OpenAI embeddings |
| OLLAMA_URL | If using Ollama | Ollama server URL (default: http://localhost:11434) |
| EMBEDDING_MODEL | No | Model name (defaults per provider) |
| EMBEDDING_DIMENSIONS | No | Vector dimensions (defaults per provider) |
| PAPERLESS_MCP_DATA | No | Directory for the vector DB (default: ~/.paperless-mcp) |
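The required variables and defaults in the table can be summarized as a small validation sketch. This is not the server's actual startup code: `loadConfig`, `ServerConfig`, and the error messages are hypothetical, and requiring OPENAI_API_KEY up front when OpenAI embeddings are enabled is an assumption based on the table's "If using OpenAI" note.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical config shape; field names are illustrative only.
interface ServerConfig {
  paperlessUrl: string;
  paperlessToken: string;
  transport: "stdio" | "http";
  port: number;
  embeddingsEnabled: boolean;
  embeddingProvider: "openai" | "ollama";
  ollamaUrl: string;
}

function loadConfig(env: Env): ServerConfig {
  const paperlessUrl = env["PAPERLESS_URL"];
  const paperlessToken = env["PAPERLESS_TOKEN"];
  if (!paperlessUrl || !paperlessToken) {
    throw new Error("PAPERLESS_URL and PAPERLESS_TOKEN are required");
  }

  // Defaults follow the table: stdio, port 3001, embeddings off, openai.
  const transport = env["MCP_TRANSPORT"] || "stdio";
  if (transport !== "stdio" && transport !== "http") {
    throw new Error("MCP_TRANSPORT must be stdio or http");
  }

  const port = Number(env["PORT"] || "3001");
  if (isNaN(port) || Math.floor(port) !== port || port <= 0) {
    throw new Error("PORT must be a positive integer");
  }

  const embeddingsEnabled = env["EMBEDDINGS_ENABLED"] === "true";
  const embeddingProvider = env["EMBEDDING_PROVIDER"] || "openai";
  if (embeddingProvider !== "openai" && embeddingProvider !== "ollama") {
    throw new Error("EMBEDDING_PROVIDER must be openai or ollama");
  }
  // Assumption: fail early if OpenAI embeddings are on but no key is set.
  if (embeddingsEnabled && embeddingProvider === "openai" && !env["OPENAI_API_KEY"]) {
    throw new Error("OPENAI_API_KEY is required for OpenAI embeddings");
  }

  return {
    paperlessUrl,
    paperlessToken,
    transport,
    port,
    embeddingsEnabled,
    embeddingProvider,
    ollamaUrl: env["OLLAMA_URL"] || "http://localhost:11434",
  };
}

// Usage in Node: const config = loadConfig(process.env);
```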
Transports
The server supports two transports, selected by MCP_TRANSPORT.
stdio (default)
Single-user. The MCP client launches the server as a subprocess and it uses
PAPERLESS_TOKEN for all requests. This is the configuration shown above.
HTTP (multi-user)
Run the server as a shared HTTP service (e.g. a sidecar next to your Paperless-ngx deployment) so other users on your network can connect:
MCP_TRANSPORT=http PORT=3001 \
  PAPERLESS_URL=https://paperless.example.com \
  PAPERLESS_TOKEN=<admin-token> \
  node dist/index.js
Clients connect to http://<host>:3001/mcp and authenticate with their own
Paperless API token via an Authorization: Bearer <token> header (or
X-Paperless-Token). Every Paperless call is made with that token, so each user
only sees the documents their account permits.
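On the client side, the two header options described above could be prepared like this. It is a minimal sketch: `buildAuthHeaders`, `mcpEndpoint`, and `AuthStyle` are hypothetical helpers, and the plain-http URL simply follows the example address given here.

```typescript
// Hypothetical client-side helpers; not part of the paperless-mcp package.
type AuthStyle = "bearer" | "x-paperless-token";

function buildAuthHeaders(
  token: string,
  style: AuthStyle = "bearer"
): Record<string, string> {
  if (token.trim() === "") {
    throw new Error("A Paperless API token is required");
  }
  // Either header carries the user's own Paperless API token.
  return style === "bearer"
    ? { Authorization: `Bearer ${token}` }
    : { "X-Paperless-Token": token };
}

function mcpEndpoint(host: string, port: number = 3001): string {
  return `http://${host}:${port}/mcp`;
}

// Example: values an MCP client would be configured with.
const endpoint = mcpEndpoint("paperless-mcp.lan");
const authHeaders = buildAuthHeaders("users-own-token");
```

Most MCP clients accept a server URL plus custom headers in their configuration, so in practice these two values are usually all a user needs to supply.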
PAPERLESS_TOKEN is the admin/indexer token: it builds the shared semantic-search
index, and the sync_embeddings tool is only available to a session using the
admin token. semantic_search results are filtered through the requesting user's
token, so users never see documents they cannot access.
Non-browser MCP clients (which don't send an Origin header) work out of the box.
Browser-based clients are blocked unless you list their origin in
MCP_ALLOWED_ORIGINS. If the server is reachable on a public hostname, set
MCP_ALLOWED_HOSTS to the expected host(s) for DNS-rebinding protection.
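The allowlist rules above can be read as the following decision logic. This is a sketch of the documented behaviour, not the server's actual code: the function names are hypothetical, every request carrying an Origin header is treated as a browser request, and hosts are compared as exact, case-insensitive strings (an assumption).

```typescript
// Parse a comma-separated allowlist; an unset or empty value yields an empty list.
function parseAllowlist(value: string | undefined): string[] {
  return (value ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// MCP_ALLOWED_ORIGINS: requests without an Origin header pass, "*" allows any
// origin, and otherwise the origin must be listed. An empty list blocks all
// requests that carry an Origin header.
function isOriginAllowed(origin: string | undefined, allowedOrigins: string[]): boolean {
  if (origin === undefined) return true;
  if (allowedOrigins.indexOf("*") !== -1) return true;
  return allowedOrigins.indexOf(origin) !== -1;
}

// MCP_ALLOWED_HOSTS: an empty list disables validation; otherwise the Host
// header must match one of the listed hosts.
function isHostAllowed(host: string | undefined, allowedHosts: string[]): boolean {
  if (allowedHosts.length === 0) return true;
  if (host === undefined) return false;
  const wanted = host.toLowerCase();
  return allowedHosts.map((h) => h.toLowerCase()).indexOf(wanted) !== -1;
}
```

For example, with MCP_ALLOWED_ORIGINS unset, a command-line MCP client (no Origin header) is accepted while a web app at https://app.example.com is rejected until that origin is added to the list.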
Run as an HTTP sidecar (Docker)
A prebuilt image is published to the GitHub Container Registry on every release
and every push to main:
ghcr.io/orellbuehler/paperless-mcp:latest # tracks main
ghcr.io/orellbuehler/paperless-mcp:1 # latest 1.x release
ghcr.io/orellbuehler/paperless-mcp:1.0.0 # exact version
Pull it directly:
docker pull ghcr.io/orellbuehler/paperless-mcp:latest
Add the server as a service next to your existing Paperless-ngx compose stack:
paperless-mcp:
  image: ghcr.io/orellbuehler/paperless-mcp:latest
  restart: unless-stopped
  depends_on:
    - webserver
  ports:
    - 3001:3001
  volumes:
    - /mnt/ssd/paperless_ngx/mcp:/data
  environment:
    MCP_TRANSPORT: http
    PORT: 3001
    PAPERLESS_URL: http://webserver:8000
    PAPERLESS_TOKEN: <admin-token>
    PAPERLESS_MCP_DATA: /data
    EMBEDDINGS_ENABLED: "true"
    EMBEDDING_PROVIDER: openai
    OPENAI_API_KEY: <key>
The image already defaults to MCP_TRANSPORT=http, PORT=3001, and
PAPERLESS_MCP_DATA=/data, so the only variables you must set are
PAPERLESS_URL and PAPERLESS_TOKEN (plus OPENAI_API_KEY when semantic
search is enabled). Mount a volume at /data to persist the embedding index
across restarts.
LAN clients connect to http://<host>:3001/mcp with their own Paperless API
token. Run sync_embeddings once with the admin token to build the shared
semantic index.
To build the image yourself instead of pulling it, a Dockerfile is included:
docker build -t paperless-mcp .