@florexlabs/docs-to-mcp (v0.2.2)

Convert any documentation URL into a ready-to-run MCP server.

100% local by default — no API keys needed. Embeddings run locally with Transformers.js.

URL → crawl → clean HTML → markdown → chunks → embeddings → vector store → MCP server

Prerequisites

  • Node.js >= 18
  • Docker (for ChromaDB): docker run -p 8000:8000 chromadb/chroma
  • Playwright browsers: npx playwright install chromium
  • No API keys needed for default local embeddings

Quick Start

# Install Playwright browsers (one-time)
npx playwright install chromium

# Start ChromaDB
docker run -p 8000:8000 chromadb/chroma

# Initialize a project from a docs URL
npx @florexlabs/docs-to-mcp init https://docs.example.com --out ./my-docs-to-mcp

cd my-docs-to-mcp
npm install

# Crawl, build, and start — no API keys needed!
npm run crawl
npm run build
npm run start

Installation

npm install -g @florexlabs/docs-to-mcp

Or use directly with npx:

npx @florexlabs/docs-to-mcp <command>

Embedding Providers

Local (default)

Uses Transformers.js with the Xenova/all-MiniLM-L6-v2 model. Runs 100% on your machine via ONNX runtime. No API keys, no external services, no cost.

docs-to-mcp build                                    # uses local by default
docs-to-mcp build --model Xenova/all-MiniLM-L6-v2    # explicit model

The model is downloaded automatically on first use (~80MB) and cached locally.
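
For context, a minimal sketch of how local embeddings with Transformers.js work is shown below. This is illustrative, not the package's actual source; only the model name and its defaults come from this README.

// Minimal sketch of local embeddings with Transformers.js (illustrative, not the package's source).
import { pipeline } from '@xenova/transformers';

// Downloads Xenova/all-MiniLM-L6-v2 on first run, then reuses the local cache.
const embed = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

// Mean-pooled, normalized sentence embedding (384 dimensions for this model).
const output = await embed('How do I configure the crawler?', { pooling: 'mean', normalize: true });
const vector = Array.from(output.data); // plain number[] ready for a vector store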

OpenAI (opt-in)

For higher quality embeddings on large documentation sets, you can use OpenAI:

export OPENAI_API_KEY=sk-...
docs-to-mcp build --provider openai
docs-to-mcp build --provider openai --model text-embedding-3-large
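
Under the hood this path calls OpenAI's embeddings endpoint. A rough sketch with the official openai client (illustrative only, not the package's internals):

// Rough sketch of the opt-in OpenAI path.
import OpenAI from 'openai';

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const res = await client.embeddings.create({
  model: 'text-embedding-3-small',
  input: ['first documentation chunk', 'second documentation chunk'],
});

const vectors = res.data.map((d) => d.embedding); // one number[] per input chunk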

Commands

docs-to-mcp init <url>

Generate a new MCP server project from a documentation URL.

docs-to-mcp init https://docs.example.com --out ./my-docs-to-mcp

Options:

  • --out <dir> — Output directory (default: ./docs-to-mcp-project)
  • --depth <n> — Crawl depth (default: 3)
  • --limit <n> — Max pages (default: 50)
  • --provider <name> — Embedding provider: local or openai (default: local)
  • --model <name> — Embedding model
  • --collection <name> — Collection name (default: docs)

docs-to-mcp crawl <url>

Crawl a documentation site, parse HTML to markdown, and chunk it.

docs-to-mcp crawl https://docs.example.com --out ./data --depth 3 --limit 50

Options:

  • --out <dir> — Output directory (default: ./data)
  • --depth <n> — Crawl depth (default: 3)
  • --limit <n> — Max pages (default: 50)
  • --verbose — Verbose output
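
A single crawl step roughly follows a render, clean, convert pattern. The sketch below uses the same libraries listed in the Architecture section (Playwright, Cheerio, Turndown); the selectors and options are illustrative assumptions, not the crawler's exact behavior:

// Sketch of one page visit: render with Playwright, strip chrome with Cheerio, convert with Turndown.
import { chromium } from 'playwright';
import * as cheerio from 'cheerio';
import TurndownService from 'turndown';

const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto('https://docs.example.com', { waitUntil: 'networkidle' });
const html = await page.content();
await browser.close();

// Keep the main article content; drop navigation, scripts, and styling.
const $ = cheerio.load(html);
$('nav, header, footer, script, style').remove();
const body = $('main').html() ?? $('body').html() ?? '';

const markdown = new TurndownService({ headingStyle: 'atx' }).turndown(body);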

docs-to-mcp build

Embed chunks and upsert into ChromaDB.

docs-to-mcp build                          # local embeddings (default)
docs-to-mcp build --provider openai        # use OpenAI instead

Options:

  • --collection <name> — Collection name (default: docs)
  • --provider <name> — local or openai (default: local)
  • --model <name> — Embedding model
  • --data <dir> — Data directory (default: ./data)
  • --force — Force rebuild
  • --verbose — Verbose output
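
The upsert into ChromaDB looks roughly like the sketch below, written with the chromadb JS client. The ids, document text, and metadata shapes are illustrative assumptions; the collection name and CHROMA_URL default match the values documented here.

// Sketch of the embed-and-upsert step (illustrative shapes, not the package's internals).
import { ChromaClient } from 'chromadb';

const chroma = new ChromaClient({ path: process.env.CHROMA_URL ?? 'http://localhost:8000' });
const collection = await chroma.getOrCreateCollection({ name: 'docs' });

await collection.upsert({
  ids: ['getting-started-0'],
  embeddings: [[0.01, -0.02, 0.03]], // in practice, a full 384- or 1536-dimension vector
  documents: ['## Getting started\nInstall the CLI and start the server ...'],
  metadatas: [{ url: 'https://docs.example.com/getting-started' }],
});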

docs-to-mcp start

Start the MCP server (stdio transport).

docs-to-mcp start --collection docs

docs-to-mcp dev

Start the MCP server in development mode with logging.

docs-to-mcp dev --collection docs

MCP Tools

The server exposes three tools:

| Tool | Description |
|------|-------------|
| search_docs(query, topK?) | Semantic search across indexed documentation |
| get_source(url) | Get all chunks from a specific source URL |
| list_sources() | List all indexed documentation sources |
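
For orientation, registering a tool like search_docs with the MCP TypeScript SDK looks roughly like this. Only the tool name and the stdio transport come from this README; the handler body and helper are hypothetical.

// Hedged sketch of exposing search_docs over stdio with the MCP TypeScript SDK.
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

// Hypothetical stand-in for the real ChromaDB-backed lookup.
async function searchVectorStore(query: string, topK: number): Promise<string[]> {
  return [`(top ${topK} chunks matching "${query}")`];
}

const server = new McpServer({ name: 'docs-to-mcp', version: '0.2.2' });

server.tool(
  'search_docs',
  { query: z.string(), topK: z.number().optional() },
  async ({ query, topK }) => {
    const hits = await searchVectorStore(query, topK ?? 5);
    return { content: [{ type: 'text', text: hits.join('\n') }] };
  }
);

await server.connect(new StdioServerTransport()); // stdio transport, as used by `start`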

Connecting to MCP Clients

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "my-docs": {
      "command": "npx",
      "args": ["@florexlabs/docs-to-mcp", "start", "--collection", "docs"],
      "env": {
        "CHROMA_URL": "http://localhost:8000"
      }
    }
  }
}

Cursor

Add to .cursor/mcp.json:

{
  "mcpServers": {
    "my-docs": {
      "command": "npx",
      "args": ["@florexlabs/docs-to-mcp", "start", "--collection", "docs"],
      "env": {
        "CHROMA_URL": "http://localhost:8000"
      }
    }
  }
}

Environment Variables

CHROMA_URL=http://localhost:8000

# Only needed with --provider openai:
OPENAI_API_KEY=sk-...
OPENAI_EMBEDDING_MODEL=text-embedding-3-small

Architecture

packages/
  cli/          — CLI commands (init, crawl, build, start, dev)
  crawler/      — Playwright-based same-origin doc crawler
  parser/       — HTML cleanup (Cheerio) + markdown conversion (Turndown)
  chunker/      — Heading-aware markdown chunking
  embeddings/   — Local (Transformers.js) + OpenAI providers
  vector-store/ — ChromaDB adapter
  mcp-server/   — MCP server with search tools
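
As a rough illustration of what "heading-aware" means for the chunker package, here is a toy sketch; the actual implementation likely also enforces a chunk size limit.

// Toy sketch of heading-aware chunking: split markdown at headings so each chunk keeps its section.
interface Chunk {
  heading: string;
  text: string;
}

function chunkByHeadings(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let current: Chunk = { heading: '', text: '' };
  for (const line of markdown.split('\n')) {
    if (/^#{1,6}\s/.test(line)) {
      if (current.text.trim()) chunks.push(current); // close the previous section
      current = { heading: line.replace(/^#{1,6}\s*/, ''), text: line + '\n' };
    } else {
      current.text += line + '\n';
    }
  }
  if (current.text.trim()) chunks.push(current);
  return chunks;
}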

Security Notes

  • Only crawls same-origin links by default
  • Never executes scraped content
  • URLs are sanitized and normalized (see the sketch after this list)
  • Local embeddings stay on your machine — nothing leaves your network
  • If using OpenAI, embeddings are sent to OpenAI's API
  • Do not crawl private documentation unless you understand where data goes
  • No shell execution from user-controlled input
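
A minimal sketch of the same-origin filtering and URL normalization mentioned above (illustrative; the crawler's exact rules may differ):

// Illustrative same-origin filter and URL normalization.
function normalizeLink(href: string, base: string): string | null {
  let url: URL;
  try {
    url = new URL(href, base);
  } catch {
    return null; // ignore malformed links
  }
  if (url.origin !== new URL(base).origin) return null; // same-origin only
  url.hash = ''; // drop fragments so the same page isn't queued twice
  return url.toString();
}

normalizeLink('/guide/intro#setup', 'https://docs.example.com'); // 'https://docs.example.com/guide/intro'
normalizeLink('https://elsewhere.example.net/x', 'https://docs.example.com'); // null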

Development

pnpm install
pnpm test
pnpm build

License

MIT