# @deepsweet/mdn

Offline-first [MDN Web Docs](https://developer.mozilla.org/) RAG-MCP server ready for semantic search with hybrid vector (1024-d) and full‑text (BM25) retrieval.
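Hybrid retrieval here means merging the full-text (BM25) ranking and the vector-similarity ranking into a single result list. This README doesn't document the package's actual fusion logic; the sketch below uses reciprocal rank fusion (RRF), a common technique for this, purely as an illustration (the document ids are made up):

```python
# Illustrative reciprocal rank fusion (RRF) of two ranked result lists:
# one from BM25 full-text search, one from vector similarity search.
# NOTE: a generic sketch of hybrid retrieval, NOT this package's
# actual fusion logic, which is not documented here.

def rrf_merge(bm25_ranked, vector_ranked, k=60, limit=3):
    """Combine two ranked lists of doc ids with reciprocal rank fusion.

    Each doc scores 1 / (k + rank) per list it appears in; docs ranked
    highly by both retrievers float to the top of the merged list.
    """
    scores = {}
    for ranking in (bm25_ranked, vector_ranked):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:limit]

bm25 = ["Array.prototype.map", "Array.prototype.flatMap", "Map"]
vectors = ["Map", "Array.prototype.map", "WeakMap"]
print(rrf_merge(bm25, vectors))
# → ['Array.prototype.map', 'Map', 'Array.prototype.flatMap']
```

"Array.prototype.map" wins because both retrievers rank it highly, even though neither puts it unambiguously first.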
## Example

## Content
The dataset covers the core MDN documentation sections, including:
- Web API
- JavaScript
- HTML
- CSS
- SVG
- HTTP
See the dataset repo on Hugging Face for more details.
## Usage
### 1. Download the dataset and embedding model

```sh
npx -y @deepsweet/mdn@latest download
```

Both the dataset (~260 MB) and the embedding model GGUF file (~438 MB) will be downloaded directly from Hugging Face and stored in its default cache location (typically `~/.cache/huggingface/`), just like the `hf download` command does.
### 2. Set up the RAG-MCP server

```json
{
  "mcpServers": {
    "mdn": {
      "command": "npx",
      "args": [
        "-y",
        "@deepsweet/mdn@latest",
        "server"
      ],
      "env": {}
    }
  }
}
```

> [!TIP]
> Remove `@latest` for a fully offline experience, but keep in mind that this will cache a fixed version without auto-updating.
The stdio server will spawn llama.cpp under the hood, load the embedding model (~655 MB RAM/VRAM), and query the dataset – all on demand.
## Settings
| Env variable | Default value | Description |
| --- | --- | --- |
| `MDN_DATASET_PATH` | Hugging Face cache | Custom dataset directory path |
| `MDN_MODEL_PATH` | Hugging Face cache | Custom model file path |
| `MDN_MODEL_TTL` | `1800` | How long (in seconds) to keep llama.cpp with the embedding model loaded in memory; `0` to prevent unloading |
| `MDN_QUERY_DESCRIPTION` | "Natural language query for hybrid vector and full-text search" | Custom search query description in case your LLM does a poor job calling the MCP tool |
| `MDN_SEARCH_RESULTS_LIMIT` | `3` | Total search results limit |
| `HF_TOKEN` | | Optional Hugging Face access token; helps with occasional "HTTP 429 Too Many Requests" errors |
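These variables go in the `env` block of the MCP server config from the Usage section. The values below are illustrative (keep the model loaded indefinitely, return up to 5 results), not recommendations:

```json
{
  "mcpServers": {
    "mdn": {
      "command": "npx",
      "args": ["-y", "@deepsweet/mdn@latest", "server"],
      "env": {
        "MDN_MODEL_TTL": "0",
        "MDN_SEARCH_RESULTS_LIMIT": "5"
      }
    }
  }
}
```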
## To do
- [x] automatically update and upload the dataset artifacts monthly with GitHub Actions
- [x] automatically prune old dataset revisions like `hf cache prune`
- [ ] figure out a better query description so that the LLM doesn't over-generate keywords
## License
The RAG-MCP server itself and the processing scripts are available under the MIT license.
