@sfdxy/sf-documentation-knowledge
v1.2.0
Published
Salesforce documentation knowledge system — collect, process, and serve SF docs for LLM agents via MCP + Context Engineering
Maintainers
Readme
@sfdxy/sf-documentation-knowledge
Collect, process, and serve Salesforce documentation for LLM agents — using Context Engineering + MCP, not RAG.
Overview
This system programmatically collects all Salesforce documentation from developer.salesforce.com (121 domains, 31,000+ pages), processes it into structured, curated knowledge files, and serves them to LLM agents via:
- Context Engineering — Pre-compiled Markdown files with
_index.mdrouting tables - MCP Server — 5 tools + 3 prompt templates via Model Context Protocol
- Knowledge Graph — 53,000+ nodes and 450,000+ edges connecting SF concepts, namespaces, services, and cross-references
No embeddings. No vector stores. No blind chunking.
Quick Start
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"sf-docs": {
"command": "npx",
"args": ["-y", "@sfdxy/sf-documentation-knowledge"]
}
}
}Restart Claude Desktop.
VS Code (GitHub Copilot)
Add to .vscode/mcp.json in your workspace (or globally in VS Code settings):
{
"servers": {
"sf-docs": {
"command": "npx",
"args": ["-y", "@sfdxy/sf-documentation-knowledge"]
}
}
}Then use @sf-docs in Copilot Chat to query Salesforce documentation.
Gemini Code Assist / Gemini CLI
Add to your MCP config (~/.gemini/settings.json or project .gemini/settings.json):
{
"mcpServers": {
"sf-docs": {
"command": "npx",
"args": ["-y", "@sfdxy/sf-documentation-knowledge"]
}
}
}Cursor
Add in Settings → MCP Servers → Add Server:
- Name:
sf-docs - Command:
npx -y @sfdxy/sf-documentation-knowledge - Transport:
stdio
Windsurf
Add to ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"sf-docs": {
"command": "npx",
"args": ["-y", "@sfdxy/sf-documentation-knowledge"]
}
}
}Any MCP Client (Generic)
Point your MCP client to:
npx -y @sfdxy/sf-documentation-knowledgeThe server uses stdio transport and is compatible with any MCP client.
Use from Source
git clone https://github.com/Avinava/sf-documentation-knowledge.git
cd sf-documentation-knowledge
npm install
npm run build
npm run mcp:startMCP Server
The MCP server loads the full 53k-node knowledge graph into memory on startup (~3s) and serves all queries instantly.
Tools
| Tool | Purpose | Example Usage |
|---|---|---|
| sf_search | Search across all 121 SF documentation domains | "Find docs about Platform Events" |
| sf_read_topic | Read a specific documentation topic's content | Read the SOQL reference page |
| sf_graph_query | Navigate the knowledge graph — related docs, namespaces, services | "Show all docs in the System namespace" |
| sf_list_domains | List all available domains, filter by service category | "List analytics domains" |
| sf_apex_lookup | Look up an Apex class with full documentation | "Look up the String class" |
Prompt Templates
| Prompt | What It Does | Arguments |
|---|---|---|
| explore_api | Walk through a Salesforce API — endpoints, auth, best practices | api: API name (e.g., "REST API") |
| debug_apex | Debug an Apex issue — class lookup, error patterns, examples | topic: class/error (e.g., "System.QueryException") |
| compare_services | Compare Salesforce products by documentation coverage | services: categories (e.g., "analytics vs commerce") |
Knowledge Base
The repository comes pre-loaded with 31,000+ curated markdown files and a Knowledge Graph (53,000+ nodes, 450,000+ edges) covering 121 domains of Salesforce documentation.
Option A: Context Engineering (File-based)
Point your AI agent to the _index.md file in any domain folder. The index acts as a routing table telling the AI which files contain which topics:
knowledge/current/<domain-name>/_index.mdEach domain folder also has a SKILL.md in skills/<domain-name>/SKILL.md that teaches AI agents how to navigate the knowledge.
Option B: Knowledge Graph
The graph at knowledge/current/graph.json connects all documentation with semantic relationships:
| Edge Type | What It Connects |
|---|---|
| references | Document → Document (52,988 cross-references) |
| belongs_to_namespace | Document → Apex Namespace (143 namespaces) |
| belongs_to_service | Domain → Service Category (16 categories) |
| is_type | Document → DocType (api-reference, developer-guide, concept, etc.) |
| tagged_with | Document → Keyword (22,610 unique keywords) |
| contains | Domain → Document |
Inspect it with:
npm run graph:statsSee Graph Schema Documentation for the full schema with node/edge types, ID conventions, and a visual diagram.
Data Pipeline
To update the knowledge base with the latest Salesforce releases, run the pipeline in order:
Step 1: Discover Available Deliverables
npm run discoverLists all documentation deliverables available from the Salesforce Index API (~127 deliverables).
Step 2: Collect Raw Data
# Collect a specific domain
npm run collect -- --domain cli-commands
# Collect all configured (P0) domains
npm run collect
# Collect ALL deliverables from the SF index API (121 domains, ~31k pages)
npm run collect -- --discoverStep 3: Process HTML to Markdown
# Process a specific domain
npm run process -- --domain cli-commands
# Process ALL collected domains
npm run process -- --discoverAutomatically cleans HTML, strips noise, parses tables, formats code blocks, creates clean Markdown, and redacts any Salesforce tokens or secrets.
Step 4: Generate Knowledge Files & Graph
# Generate ALL collected domains and rebuild the full Knowledge Graph
npm run generate -- --discoverBuilds the knowledge graph (cross-references, namespaces, service categories, doctype clustering), generates context files, and updates inventory docs.
Step 5: Inspect the Graph
npm run graph:statsFull Pipeline (One-liner)
npm run collect -- --discover && npm run process -- --discover && npm run generate -- --discoverCLI Reference
| Command | Description |
|---|---|
| npm run discover | List available SF documentation deliverables |
| npm run collect | Download raw HTML documentation |
| npm run process | Convert HTML → Markdown with tagging |
| npm run generate | Generate knowledge files + graph |
| npm run graph:stats | Analyze the knowledge graph |
| npm run mcp:start | Start the MCP server (stdio) |
| npm run build | Compile TypeScript |
| npm run test | Run test suite |
| npm run lint | Run ESLint |
All pipeline commands support --domain <name> for single-domain processing and --discover for all-domain processing.
CI/CD
| Workflow | Trigger | What It Does |
|---|---|---|
| CI | Push / PR to master | Build, test, lint, MCP smoke test |
| Release | Push v* tag | Build, test, publish to npm, create GitHub release |
| Update Knowledge | Weekly (Sunday) | Run full pipeline to refresh docs |
Documentation
| Document | Description | |---|---| | Architecture | System design, data flow, 4-layer architecture | | Graph Schema | Node/edge types, ID conventions, query examples | | Full Inventory | Complete list of all 121 documentation domains | | Repo Development Skill | How to develop and extend this repo |
License
MIT © Avinava
Inventory
| Domain | Description | Status | Files | |---|---|---|---| | Salesforce Field Reference Guide | Use this concise reference to quickly look up details of the standard fields for | ✅ Available | 4817 | | Apex Reference | Apex class library reference — all system classes and methods | ✅ Available | 4604 | | Connect REST API Developer Guide | Integrate mobile apps, intranet sites, and third-party web applications with Sal | ✅ Available | 2431 | | Object Reference for the Salesforce Platform | Get details on standard objects so that you can interface with your Salesforce d | ✅ Available | 1774 | | OmniStudio | OmniStudio — OmniScripts, FlexCards, DataRaptors, Integration Procedures | ✅ Available | 1295 | | Revenue Cloud / Agentforce Revenue Management | Product catalog, pricing, billing, Dynamic Revenue Orchestrator | ✅ Available | 1209 | | Public Sector Solutions Developer Guide | Use Public Sector Solutions API and developer resources to unify public service | ✅ Available | 1003 | | Salesforce Health Cloud Developer Guide | Use the Health Cloud API to configure the Health Cloud console, which helps care | ✅ Available | 832 | | Life Sciences Cloud Developer Guide | Use the developer resources of Life Sciences Cloud to automate the operations av | ✅ Available | 699 | | Metadata API | Metadata API — deployment, retrieval, metadata types | ✅ Available | 683 | | Insurance Developer Guide | Learn more about the developer sources of Insurance to automate the backend work | ✅ Available | 616 | | Visualforce Developer Guide | Learn how to develop custom user interfaces and apps with Visualforce, a framewo | ✅ Available | 609 | | Apex Developer Guide | Apex language guide — syntax, triggers, testing, best practices | ✅ Available | 542 | | Financial Services Cloud Developer Guide | Extend Financial Services Cloud with other Salesforce products using the API and | ✅ Available | 526 | | Consumer Goods Cloud Developer Guide | Use APIs and developer resources to configure, customize, and extend the capabil | ✅ Available | 522 | | CRM Analytics REST API Developer Guide | Describes how to send queries directly to CRM Analytics, access datasets that ha | ✅ Available | 519 | | Loyalty Management Developer Guide | Use Loyalty Management API and developer resources to create personalized loyalt | ✅ Available | 501 | | Lightning Aura Components Developer Guide | Create Aura components for Salesforce for Android, iOS, and mobile web and Light | ✅ Available | 491 | | Data Cloud | Data Cloud developer guide — data models, connectors, identity resolution | ✅ Available | 400 | | ISVforce Guide | Plan, build, and sell AppExchange solutions and consulting services. | ✅ Available | 347 | | Tooling API | Tooling API — code coverage, debug logs, custom fields | ✅ Available | 338 | | Einstein Discovery REST API Developer Guide | Describes how to send queries directly to CRM Analytics, access datasets that ha | ✅ Available | 312 | | REST API | Salesforce REST API — resources, methods, composite, batch | ✅ Available | 307 | | Nonprofit Cloud Developer Guide | Use APIs and developer resources to configure, customize, and extend the capabil | ✅ Available | 304 | | Education Cloud Developer Guide | Education Cloud gives you the tools and developer resources you need to support | ✅ Available | 301 | | Data Prep Recipe REST API Developer Guide | Describes how to retrieve, update, and schedule Data Prep recipes. | ✅ Available | 296 | | Service Cloud | Service Cloud — cases, knowledge, omni-channel, entitlements | ✅ Available | 282 | | Net Zero Cloud Developer Guide | Use Net Zero API and developer resources to integrate a complete sustainability | ✅ Available | 265 | | User Interface API Developer Guide | User Interface API enables you to create native mobile apps and custom web apps | ✅ Available | 257 | | Field Service | Field Service — work orders, scheduling, mobile, territories | ✅ Available | 247 | | + 91 more domains | See full inventory | ✅ Available | 5,859 |
121 domains | 33,188 knowledge files
