
@salesforcebob/sf-docs-mcp-server v1.3.4

SF Docs MCP Server


An MCP (Model Context Protocol) server for scraping Salesforce developer documentation and converting it to Markdown. Integrates with Cursor, Claude Desktop, and other MCP-compatible AI assistants. Deploy locally or to Heroku with one click.


What You Get

  • 🔍 Smart page analysis - Automatically detects optimal extraction strategy for any Salesforce doc page
  • 🕸️ Shadow DOM traversal - Handles React components and deeply nested shadow DOMs
  • 📄 Multiple page types - Supports guide, reference, API reference, type definitions, and landing pages
  • 🎯 Dynamic selectors - Falls back to custom selectors when automatic extraction fails
  • 📝 Clean Markdown output - Converts HTML to GFM-compatible Markdown with tables
  • 🚀 Heroku ready - One-click deploy for remote/hosted access

Prerequisites

  • Node.js >= 18.0.0
  • Chrome/Chromium (installed automatically by Puppeteer)

Install

npm install -g @salesforcebob/sf-docs-mcp-server

Or use directly with npx (no installation required):

npx @salesforcebob/sf-docs-mcp-server

Run via npx

npx @salesforcebob/sf-docs-mcp-server

This starts an MCP stdio server. Use it with MCP-compatible clients like Cursor or Claude Desktop.
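
If you want to exercise the stdio server outside of an editor, a programmatic MCP client also works. The sketch below is only an illustration: it assumes the official MCP TypeScript SDK (@modelcontextprotocol/sdk) is installed, which is not a dependency of this package and is not needed for normal Cursor/Claude Desktop usage; the client name and script file are hypothetical.

// Minimal sketch of a programmatic MCP client (run as an ES module, e.g. node client.mjs).
// Assumes @modelcontextprotocol/sdk is installed; not required for editor-based usage.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Launch the server over stdio exactly as the editor configs below do.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "@salesforcebob/sf-docs-mcp-server"],
});
const client = new Client({ name: "example-client", version: "0.0.1" });
await client.connect(transport);

const { tools } = await client.listTools();
console.log(tools.map((t) => t.name)); // should include scrape_sf_docs and analyze_page_structure

const result = await client.callTool({
  name: "scrape_sf_docs",
  arguments: {
    url: "https://developer.salesforce.com/docs/einstein/genai/guide/get-started.html",
  },
});
console.log(result);
await client.close();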


Using with Cursor

  1. Open Cursor settings → MCP/Servers
  2. Add a new stdio server:
{
  "mcpServers": {
    "sf-docs": {
      "command": "npx",
      "args": ["-y", "@salesforcebob/sf-docs-mcp-server"]
    }
  }
}

Or add to your Cursor MCP configuration file (~/.cursor/mcp.json).

  3. Save and reload tools. You should see:
    • scrape_sf_docs
    • analyze_page_structure

Using with Claude Desktop

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "sf-docs": {
      "command": "npx",
      "args": ["-y", "@salesforcebob/sf-docs-mcp-server"]
    }
  }
}

Running Remotely (Heroku)

This server includes an Express HTTP transport for remote deployment.

One-Click Deploy to Heroku

Deploy to Heroku

After clicking Deploy:

  1. Choose an app name
  2. Deploy the app
  3. Verify endpoints:
    • GET /health → { ok: true }
    • GET /docs → Documentation JSON
    • POST /mcp → MCP HTTP endpoint

Local HTTP (for testing)

npm run serve
# or
npx @salesforcebob/sf-docs-mcp-server serve

Endpoints:

  • GET http://localhost:3000/health → Health check
  • GET http://localhost:3000/docs → Documentation
  • POST http://localhost:3000/mcp → MCP HTTP endpoint

Using with HTTP-capable MCP Clients

Point your client at <your-app-url>/mcp as the MCP HTTP endpoint.
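
For a quick sanity check of a deployment, the two GET endpoints can be hit from Node 18+ (which ships a global fetch). Only the { ok: true } shape for /health is documented above; everything else is printed as returned. Swap the base URL for your Heroku app URL when testing a remote deploy.

// Smoke test for the HTTP transport (run as an ES module, e.g. node check.mjs).
// Replace BASE with your deployed app URL when not testing locally.
const BASE = "http://localhost:3000";

const health = await fetch(`${BASE}/health`);
console.log(await health.json()); // expected: { ok: true }

const docs = await fetch(`${BASE}/docs`);
console.log(await docs.json()); // documentation JSON describing the server

// Actual MCP traffic is JSON-RPC POSTed to `${BASE}/mcp` by your MCP client.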


Available Tools

scrape_sf_docs

Scrape a Salesforce documentation page and return the content as Markdown.

Input:

  • url (string, required): The Salesforce documentation URL to scrape
  • selector (string, optional): CSS selector for content container (light DOM only)
  • shadowPath (string[], optional): Array of selectors to traverse shadow DOM boundaries

Examples:

// Basic usage (automatic detection)
{
  "url": "https://developer.salesforce.com/docs/einstein/genai/guide/get-started.html"
}

// With shadow path for nested shadow DOM
{
  "url": "https://developer.salesforce.com/docs/commerce/einstein-api/references/einstein-profile-connector?meta=type:ClientIdParam",
  "shadowPath": ["doc-amf-reference", "doc-amf-topic", "api-type-documentation"]
}

analyze_page_structure

Analyze the DOM structure of a Salesforce documentation page to determine the best extraction approach. Use this first when the default scraper fails or returns empty content.

Input:

  • url (string, required): The Salesforce documentation URL to analyze

Output:

  • Detected page type
  • List of custom elements found
  • Elements with shadow DOM
  • Content containers with suggested selectors/shadow paths
  • Suggested extraction approach
  • DOM tree snapshot for debugging

Things You Can Ask

Here are examples of what you can ask your AI assistant:

  • "Get the Agentforce getting started documentation"
  • "Scrape the Models API reference page"
  • "Extract the GraphQL Send Query endpoint documentation"
  • "Analyze the page structure of this Commerce Cloud API page"
  • "Get all the type definitions from the Einstein Profile Connector API"
  • "Show me the Agent Script language reference"

Quick JSON examples:

Scrape a guide page:

{
  "tool": "scrape_sf_docs",
  "input": {
    "url": "https://developer.salesforce.com/docs/einstein/genai/guide/agent-script.html"
  }
}

Analyze a failing page:

{
  "tool": "analyze_page_structure",
  "input": {
    "url": "https://developer.salesforce.com/docs/commerce/einstein-api/references/einstein-profile-connector?meta=type:CookieIdParam"
  }
}

How It Works

The Salesforce developer docs use a React-based architecture with nested shadow DOM components. This server handles multiple page structures:

Supported Page Types

| Type          | URL Pattern                 | Description                           |
|---------------|-----------------------------|---------------------------------------|
| guide         | /guide/*                    | Guide/tutorial pages                  |
| reference     | /references/* with markdown | Reference pages with markdown content |
| api-reference | /references/*?meta=Summary  | API summary pages                     |
| api-type      | /references/*?meta=type:*   | Type definition pages                 |
| api-method    | /references/*?meta=*        | Method/endpoint pages                 |
| overview      | Landing pages               | Overview/landing pages                |
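
As a rough illustration of how the URL patterns in the table line up (not the server's actual detection logic, which also inspects the rendered DOM), a simple classifier might look like the sketch below; the function name is hypothetical.

// Hypothetical sketch of URL-pattern matching for the table above; the real server
// also checks the rendered page (e.g. whether a reference page actually contains
// markdown content), so treat this as illustration only.
function classifyByUrl(url) {
  const { pathname, searchParams } = new URL(url);
  const meta = searchParams.get("meta") || "";

  if (pathname.includes("/guide/")) return "guide";
  if (pathname.includes("/references/")) {
    if (meta === "Summary") return "api-reference";
    if (meta.startsWith("type:")) return "api-type";
    if (meta) return "api-method";
    return "reference";
  }
  return "overview"; // landing pages and anything else
}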

Custom Elements Handled

  • doc-heading - Headings with nested shadow DOM
  • doc-content-callout - Tips, notes, warnings
  • dx-code-block - Code snippets with syntax highlighting
  • api-summary - API overview pages
  • api-type-documentation - Type definition pages
  • api-method-documentation - Method/endpoint pages
  • dx-group-text - Landing page content
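
These elements keep their content behind shadow roots, which is what the shadowPath input works around. Purely as an illustration of what that traversal involves (not the server's implementation), a walk like the following would run in the page context, for example via Puppeteer's page.evaluate; the function name is hypothetical.

// Illustrative only: resolve a shadowPath like
// ["doc-amf-reference", "doc-amf-topic", "api-type-documentation"]
// by stepping through each element's shadow root in turn.
function resolveShadowPath(shadowPath) {
  let scope = document;
  for (const selector of shadowPath) {
    const el = scope.querySelector(selector);
    if (!el) throw new Error(`Could not find element: ${selector}`);
    // Descend into the shadow root if the element has one; otherwise keep
    // searching the light DOM below it.
    scope = el.shadowRoot || el;
  }
  return scope; // the container whose HTML gets converted to Markdown
}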

Agent Usage Guide

For detailed instructions on how AI agents should use these tools, see AGENT_GUIDE.md.


Batch Scraping (Optional)

For batch scraping multiple pages at once, you can use the included scraper script:

// Edit the urls array in scraper.js
const urls = [
  'https://developer.salesforce.com/docs/einstein/genai/guide/get-started.html',
  // Add more URLs here
];

Then run:

npm run scrape
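
The bundled scraper.js is the script that actually runs here. Purely to illustrate the kind of loop involved (the real conversion pipeline may differ), a minimal Puppeteer-based batch scrape could look like the sketch below; the turndown dependency and the output file naming are assumptions, not part of this package's documented behavior.

// Rough sketch only, not the bundled scraper.js. Assumes puppeteer and turndown
// are available (run as an ES module, e.g. node batch.mjs).
import fs from "node:fs";
import puppeteer from "puppeteer";
import TurndownService from "turndown";

const urls = [
  "https://developer.salesforce.com/docs/einstein/genai/guide/get-started.html",
];

const browser = await puppeteer.launch();
const turndown = new TurndownService();

for (const url of urls) {
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle0" }); // wait for the docs app to render
  const html = await page.content();
  const markdown = turndown.turndown(html);
  // Derive a file name from the last path segment, e.g. get-started.md
  const name = new URL(url).pathname.split("/").pop().replace(/\.html$/, "") || "index";
  fs.writeFileSync(`${name}.md`, markdown);
  await page.close();
}
await browser.close();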

Troubleshooting

| Problem                                  | Solution                                                       |
|------------------------------------------|----------------------------------------------------------------|
| Empty content with pageType: "fallback"  | Use analyze_page_structure to find the right extraction method |
| Shadow path not working                  | Check the DOM snapshot for the correct element names           |
| Content looks incomplete                 | Try a different shadowPath or selector                         |
| "Could not find element" error           | The shadow path is incorrect - re-analyze the page             |
| Puppeteer/Chrome issues                  | Ensure Chrome is installed or set PUPPETEER_EXECUTABLE_PATH    |



Disclaimer

  • This repository and MCP server are provided "as is" without warranties or guarantees of any kind, express or implied, including but not limited to functionality, security, merchantability, or fitness for a particular purpose.
  • Use at your own risk. Review the source, perform a security assessment, and harden before any production deployment.
  • Do not expose the HTTP endpoints publicly without proper authentication/authorization, rate limiting, logging, and monitoring.
  • This tool scrapes publicly available Salesforce documentation. Ensure your usage complies with Salesforce's terms of service.
  • You are solely responsible for the protection of your data and compliance with your organization's security policies.

License

MIT