markdown-docs-mcp

v0.1.5

Published

2 months ago

MCP server for efficient navigation of large markdown documents (datasheets, standards, manuals)


hacker-cb

mcp model-context-protocol markdown documentation datasheet

markdown-docs-mcp

MCP server for efficient navigation of large markdown documents — datasheets, IEC/ISO standards, reference manuals.

Lets agents read what they need from a 100 000+ line markdown file without dumping the whole thing into context.

Tools

view_toc — get document structure (TOC with line ranges, sizes, anomaly hints; auto-trims to fit a configurable byte cap).
read_section — fetch a single section by opaque id; raw and logical reading modes; comment stripping; continuation via from_line.
search — literal or regex search across titles and content with section context.
analyze_document — diagnostic report on structural anomalies (self-nesting, level jumps, PDF-conversion artifacts, etc.).

Installation

Claude Code plugin

Run inside a Claude Code session:

/plugin marketplace add hacker-cb/markdown-docs-mcp
/plugin install markdown-docs@hacker-cb

The plugin installs into user scope by default. To install for a team-shared project instead (writes .claude/settings.json in the project), pass --scope project to the second command:

/plugin install markdown-docs@hacker-cb --scope project

Direct MCP config (any MCP-compatible client)

Add to ~/.claude/settings.json, project .mcp.json, Cursor settings, Continue config, etc.:

{
  "mcpServers": {
    "markdown-docs": {
      "command": "npx",
      "args": ["-y", "markdown-docs-mcp@latest"]
    }
  }
}

Example prompt

Once installed, point the agent at a large markdown file and ask anything:

Explain the peripherals of @tests/fixtures/public/esp32-p4-trm.md

The agent will call view_toc to skim the 143k-line manual, drill into relevant sections via read_section, and answer from a focused subset — without ever loading the full file into its context.

The @<path> syntax is Claude Code's file-attachment shortcut. In Cursor / Continue / other MCP clients, pass the absolute or workspace-relative path as plain text instead.
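On the wire, the view_toc then read_section sequence described above is a pair of MCP tools/call requests. The sketch below builds them as plain JSON-RPC 2.0 objects. The envelope (jsonrpc, method, params.name, params.arguments) follows the MCP specification, but the argument names path and id and the section id "sec-42" are assumptions made for illustration; the real names come from the schemas the server reports via tools/list.

```typescript
// Envelope of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Small helper that fills in the fixed parts of the envelope.
function toolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// ASSUMPTION: "path" and "id" are hypothetical argument names and "sec-42"
// is a made-up section id; check the schemas returned by tools/list.
const docPath = "tests/fixtures/public/esp32-p4-trm.md";
const requests: ToolCallRequest[] = [
  toolCall(1, "view_toc", { path: docPath }),
  toolCall(2, "read_section", { path: docPath, id: "sec-42" }),
];

for (const req of requests) {
  console.log(JSON.stringify(req));
}
```

In practice an MCP client library sends these for you; the point is only that the agent fetches the TOC first and then requests individual sections, never the whole file.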

Configuration

Two optional env vars override response caps (see src/config.ts):

| Variable | Default | Ceiling | Purpose |
| ------------------------------- | ------- | ------- | -------------------------------- |
| MARKDOWN_DOCS_MAX_TOC_BYTES | 51 200 | 500 000 | Cap on view_toc response payload |
| MARKDOWN_DOCS_MAX_SECTION_BYTES | 204 800 | 500 000 | Cap on read_section.content |

Invalid values warn to stderr and fall back to default; the server does not refuse to start.
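The fallback behaviour can be pictured with a small resolver. This is a sketch, not the contents of src/config.ts: the function name resolveCap and its signature are hypothetical, and clamping over-ceiling values (rather than treating them as invalid) is an assumption the README does not confirm.

```typescript
// Hypothetical helper; the actual logic lives in src/config.ts, which is not shown here.
function resolveCap(
  varName: string,
  raw: string | undefined,
  defaultBytes: number,
  ceilingBytes: number,
  warn: (msg: string) => void = (msg) => console.error(msg), // console.error writes to stderr
): number {
  // Unset or empty: use the default without warning.
  if (raw === undefined || raw.trim() === "") return defaultBytes;

  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    warn(`${varName}: invalid value "${raw}", falling back to ${defaultBytes}`);
    return defaultBytes;
  }
  if (value > ceilingBytes) {
    // ASSUMPTION: over-ceiling values are clamped; the server may instead
    // treat them as invalid and fall back to the default.
    warn(`${varName}: ${value} exceeds ceiling, using ${ceilingBytes}`);
    return ceilingBytes;
  }
  return value;
}

// Defaults and ceiling taken from the table above.
const tocCap = resolveCap("MARKDOWN_DOCS_MAX_TOC_BYTES", "65536", 51_200, 500_000);
console.log(tocCap);
```

Either way the server keeps running: a bad value costs a warning, not a startup failure.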

Known limitations

Issues identified during the pre-release code review but deliberately deferred — they have not surfaced on real workloads so far, and fixing them prematurely would add complexity for no current benefit. Revisit when observed in practice:

Regex DoS surface in search. A user-supplied regex (regex: true) is evaluated against every body line of the indexed document (up to 143k lines / 5 MB on the largest fixture). A pathological pattern with nested quantifiers ((a+)+$ and similar) can hang the server thread with no way for the MCP client to cancel it. Mitigation when needed: depend on re2 (linear-time engine) or wrap .exec in a setImmediate-paced loop with a wall-clock budget.
Concurrent getOrBuild race in the LRU cache. Two near-simultaneous tool calls on a freshly-evicted (or fresh) file both miss, both run buildIndex end-to-end (90–120 s on the largest fixture), and the second result overwrites the first. Stdio MCP transports are typically single-flight per session so this is mostly theoretical today, but parallel-aware clients (Promise.all([...])) can hit it. Fix: store the in-flight Promise<Index> in a Map<string, Promise<Index>> keyed by the resolved path.
stripComments is O(lines × ranges). For each comment range the function builds a per-line Set and probes it line-by-line. On a section containing thousands of comment markers this becomes the dominant cost of read_section. Today it stays cheap because the section byte cap keeps the inner loop small; if MARKDOWN_DOCS_MAX_SECTION_BYTES is raised toward the 500 KB ceiling it will degrade. Fix: sort ranges once at index-build time and use a two-pointer walk, or precompute a boolean line-flag array.
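Two of the fixes named above are small enough to sketch: the in-flight promise map for getOrBuild and the two-pointer walk for stripComments. Neither is the project's code. DocIndex, Builder, LineRange and stripCommentLines are hypothetical names, and the range representation (inclusive, 0-based line indices, sorted by start) is an assumption.

```typescript
// --- Single-flight index build ---------------------------------------------
// Hypothetical shapes: the real index type and buildIndex signature are not shown here.
type DocIndex = { lineCount: number };
type Builder = (resolvedPath: string) => Promise<DocIndex>;

const inFlight = new Map<string, Promise<DocIndex>>();

function getOrBuild(resolvedPath: string, build: Builder): Promise<DocIndex> {
  const pending = inFlight.get(resolvedPath);
  if (pending) return pending; // a build is already running: share it

  const promise = build(resolvedPath);
  inFlight.set(resolvedPath, promise);
  // Forget the entry once the build settles so a failed build can be retried.
  // A real cache would also store the resolved index in its LRU at this point.
  const forget = () => { inFlight.delete(resolvedPath); };
  promise.then(forget, forget);
  return promise;
}

// --- Two-pointer comment stripping -----------------------------------------
// ASSUMPTION: ranges are inclusive, 0-based line indices, sorted by start.
type LineRange = { start: number; end: number };

function stripCommentLines(lines: string[], sortedRanges: LineRange[]): string[] {
  const kept: string[] = [];
  let r = 0; // first range that has not ended before the current line
  for (let i = 0; i < lines.length; i++) {
    // Skip ranges that end before line i; r never moves backwards,
    // so the whole walk is O(lines + ranges).
    while (r < sortedRanges.length && sortedRanges[r].end < i) r++;
    const inComment = r < sortedRanges.length && sortedRanges[r].start <= i;
    if (!inComment) kept.push(lines[i]);
  }
  return kept;
}
```

getOrBuild is deliberately not declared async, so two callers arriving in the same tick receive the very same Promise object and the builder runs once.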

Not in scope (MVP)

Deliberate omissions — out of scope for the current MCP, not bugs:

Semantic search (embeddings). search is grep-like by design.
Table parsing into JSON / structured form.
Image parsing / OCR.
On-disk index cache (in-memory LRU is enough for the realistic working set).
A numbering_filter parameter on view_toc — add it if a real workflow needs it.
Performance benchmarks as a CI gate (smoke tests only).
Wrappers for other platforms (Cursor extension, Copilot CLI plugin, Gemini CLI extension, etc.). The MCP server already works with any MCP-compatible client over stdio; a platform-specific wrapper is separate work, done on demand.

Development

pnpm install
pnpm test
pnpm build

Releasing

Versions live in four places that must stay in lockstep: package.json, .claude-plugin/plugin.json, .claude-plugin/marketplace.json (plugins[0].version), and .mcp.json (args[1]). scripts/release.mjs bumps them atomically and creates a release: vX.Y.Z commit plus a vX.Y.Z tag. From master:

pnpm release --dry-run 0.2.0    # preview diff
pnpm release 0.2.0              # apply + commit + tag
git push --follow-tags origin master

Flag forwarding note: pnpm 10 (this repo's pinned version) forwards unknown flags like --dry-run to the script. If you use a different pnpm version or another package manager that intercepts the flag, insert -- to disambiguate (pnpm release -- --dry-run 0.2.0) or invoke the script directly (node scripts/release.mjs --dry-run 0.2.0).

The release.yml workflow runs on v* tags: full test matrix → build → npm publish --provenance --access public via npm Trusted Publisher (OIDC, no NPM_TOKEN needed) → GitHub Release with auto-generated notes. Trusted Publisher must be configured once on npmjs.com under the maintainer account.
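The lockstep requirement on the four version fields can also be checked mechanically, for example in a test. The sketch below is not scripts/release.mjs; it works on already-parsed JSON, and it assumes the .mcp.json server entry is keyed "markdown-docs" and that args[1] has the form markdown-docs-mcp@X.Y.Z. All type and function names are hypothetical.

```typescript
// Hypothetical shapes for the four files; only the fields read here are modeled.
interface VersionSources {
  packageJson: { version: string };
  pluginJson: { version: string };
  marketplaceJson: { plugins: { version: string }[] };
  // ASSUMPTION: server entry keyed "markdown-docs", args[1] like "markdown-docs-mcp@X.Y.Z".
  mcpJson: { mcpServers: Record<string, { args: string[] }> };
}

function collectVersions(src: VersionSources): Record<string, string> {
  const pinned = src.mcpJson.mcpServers["markdown-docs"].args[1];
  return {
    "package.json": src.packageJson.version,
    ".claude-plugin/plugin.json": src.pluginJson.version,
    ".claude-plugin/marketplace.json": src.marketplaceJson.plugins[0].version,
    // Take everything after the last "@" as the pinned version.
    ".mcp.json": pinned.slice(pinned.lastIndexOf("@") + 1),
  };
}

// Returns the files whose version differs from package.json; empty means lockstep.
function findDrift(versions: Record<string, string>): string[] {
  const expected = versions["package.json"];
  return Object.keys(versions).filter((file) => versions[file] !== expected);
}

const drift = findDrift(
  collectVersions({
    packageJson: { version: "0.2.0" },
    pluginJson: { version: "0.2.0" },
    marketplaceJson: { plugins: [{ version: "0.1.5" }] }, // deliberately stale
    mcpJson: { mcpServers: { "markdown-docs": { args: ["-y", "markdown-docs-mcp@0.2.0"] } } },
  }),
);
console.log(drift);
```

A non-empty result names the files that were missed by a manual bump, which is the failure mode the release script exists to prevent.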

License

MIT — see LICENSE.
