documents-mcp
v1.1.2
MCP server for creating and reading PDF, DOCX, and PPTX documents
Documents MCP
An MCP (Model Context Protocol) server that provides AI agents with tools to create and read PDF, DOCX, and PPTX documents.
Features
- Create PDF - Generate PDF documents with text, headings, tables, images, and page numbers
- Create DOCX - Generate Word documents with headings, paragraphs, lists, tables, and images
- Create PPTX - Generate PowerPoint presentations with slides, text, shapes, tables, and charts
- Read PDF - Extract text content and metadata from PDF files (supports Gemini AI analysis)
- Read DOCX - Extract text and HTML content from Word documents (supports Gemini AI analysis)
- Read PPTX - Extract text content from PowerPoint presentations (supports Gemini AI analysis)
Installation
npm install documents-mcp
Or use directly with npx:
npx documents-mcp
Usage with Claude Desktop
Add to your Claude Desktop configuration file (~/Library/Application Support/Claude/claude_desktop_config.json on macOS, ~/.config/claude/claude_desktop_config.json on Linux, or %APPDATA%\Claude\claude_desktop_config.json on Windows):
{
  "mcpServers": {
    "documents-mcp": {
      "command": "npx",
      "args": ["documents-mcp"],
      "env": {
        "OUTPUT_DIR": "/path/to/output/documents"
      }
    }
  }
}
Environment Variables
All environment variables are optional. Configure only the providers you need:
AI Provider API Keys
| Variable | Description |
|----------|-------------|
| OPENAI_API_KEY | OpenAI API key |
| ANTHROPIC_API_KEY | Anthropic API key |
| GOOGLE_API_KEY | Google AI API key |
| GEMINI_API_KEY | Gemini API key (alias for GOOGLE_API_KEY) |
| PERPLEXITY_API_KEY | Perplexity API key |
| XAI_API_KEY | xAI (Grok) API key |
| GROQ_API_KEY | Groq API key |
Local Model Endpoints
| Variable | Default | Description |
|----------|---------|-------------|
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama server URL |
| LMSTUDIO_BASE_URL | http://localhost:1234 | LM Studio server URL |
| VLLM_BASE_URL | http://localhost:8000 | vLLM server URL |
Output Configuration
| Variable | Default | Description |
|----------|---------|-------------|
| OUTPUT_DIR | Current directory | Default directory for saved documents |
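The defaults in the tables above could be resolved roughly as follows. This is a minimal sketch, not the package's actual implementation: `resolveConfig`, `ResolvedConfig`, and `Env` are hypothetical names, and letting GOOGLE_API_KEY take precedence over GEMINI_API_KEY when both are set is an assumption.

```typescript
// Hypothetical helper that mirrors the documented defaults; not the package's real code.
type Env = Record<string, string | undefined>;

interface ResolvedConfig {
  ollamaBaseUrl: string;
  lmStudioBaseUrl: string;
  vllmBaseUrl: string;
  googleApiKey: string | undefined;
  outputDir: string;
}

function resolveConfig(env: Env, currentDir: string): ResolvedConfig {
  return {
    ollamaBaseUrl: env["OLLAMA_BASE_URL"] ?? "http://localhost:11434",
    lmStudioBaseUrl: env["LMSTUDIO_BASE_URL"] ?? "http://localhost:1234",
    vllmBaseUrl: env["VLLM_BASE_URL"] ?? "http://localhost:8000",
    // Assumption: GOOGLE_API_KEY wins when both variables are set.
    googleApiKey: env["GOOGLE_API_KEY"] ?? env["GEMINI_API_KEY"],
    // OUTPUT_DIR falls back to the current directory, as in the table.
    outputDir: env["OUTPUT_DIR"] ?? currentDir,
  };
}

const config = resolveConfig(
  { GEMINI_API_KEY: "example-key", VLLM_BASE_URL: "http://gpu-box:8000" },
  "/tmp",
);
console.log(config);
```

In a Node.js process you would typically pass `process.env` and `process.cwd()` as the two arguments.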
Tool Reference
create-pdf
Create a PDF document with structured content.
Parameters:
- title (required): Document title
- author (optional): Document author
- content (required): Array of content items:
  - { type: "text", content: string, fontSize?: number, bold?: boolean, color?: {r, g, b} }
  - { type: "heading", content: string, level: 1-6 }
  - { type: "table", headers: string[], rows: string[][] }
  - { type: "image", base64: string, format: "png"|"jpg"|"jpeg", width?: number, height?: number }
  - { type: "pageBreak" }
- outputPath (optional): File path to save the PDF
- pageSize (optional): "A4", "Letter", or "Legal"
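The parameter list above maps naturally onto a discriminated union. The sketch below types a create-pdf payload and runs one pre-flight check on it. The type names (`PdfContentItem`, `CreatePdfArgs`, `Rgb`) and the `tablesAreRectangular` helper are hypothetical, the sample values are made up, and whether the tool itself rejects ragged tables is not stated in this README.

```typescript
// Hypothetical type names; the field names follow the create-pdf parameter list.
type Rgb = { r: number; g: number; b: number };

type PdfContentItem =
  | { type: "text"; content: string; fontSize?: number; bold?: boolean; color?: Rgb }
  | { type: "heading"; content: string; level: 1 | 2 | 3 | 4 | 5 | 6 }
  | { type: "table"; headers: string[]; rows: string[][] }
  | { type: "image"; base64: string; format: "png" | "jpg" | "jpeg"; width?: number; height?: number }
  | { type: "pageBreak" };

interface CreatePdfArgs {
  title: string;
  author?: string;
  content: PdfContentItem[];
  outputPath?: string;
  pageSize?: "A4" | "Letter" | "Legal";
}

// Illustrative payload with made-up values.
const reportArgs: CreatePdfArgs = {
  title: "Quarterly Report",
  author: "Example Author",
  pageSize: "A4",
  content: [
    { type: "heading", content: "Summary", level: 1 },
    { type: "text", content: "Revenue grew in every region.", bold: true },
    { type: "table", headers: ["Region", "Revenue"], rows: [["EMEA", "1.2M"], ["APAC", "0.9M"]] },
    { type: "pageBreak" },
    { type: "heading", content: "Details", level: 2 },
  ],
};

// Pre-flight check: every table row has as many cells as there are headers.
function tablesAreRectangular(items: PdfContentItem[]): boolean {
  for (const item of items) {
    if (item.type !== "table") continue;
    const width = item.headers.length;
    if (item.rows.some((row) => row.length !== width)) return false;
  }
  return true;
}

console.log(tablesAreRectangular(reportArgs.content)); // true
```

An object shaped like `reportArgs` is what you would pass as the arguments of the create-pdf tool.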
create-docx
Create a Word document with rich formatting.
Parameters:
- title (required): Document title
- author (optional): Document author
- content (required): Array of content items:
  - { type: "text", content: string, bold?: boolean, italic?: boolean, underline?: boolean }
  - { type: "heading", content: string, level: 1-6 }
  - { type: "paragraph", content: string, alignment?: "left"|"center"|"right"|"justified" }
  - { type: "bulletList", items: string[] }
  - { type: "numberedList", items: string[] }
  - { type: "table", headers: string[], rows: string[][] }
  - { type: "image", base64: string, width?: number, height?: number }
  - { type: "pageBreak" }
- outputPath (optional): File path to save the DOCX
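Agents often start from loosely structured notes rather than a ready-made content array. As an illustration, the sketch below converts a small notes format into create-docx content items. The helper `notesToDocxContent`, the `DocxContentItem` type, and the notes format itself ("# " for a heading, "- " for a bullet) are assumptions made for this example, and only three of the documented item types are covered.

```typescript
// Hypothetical subset of the create-docx content items listed above.
type DocxContentItem =
  | { type: "heading"; content: string; level: 1 | 2 | 3 | 4 | 5 | 6 }
  | { type: "paragraph"; content: string; alignment?: "left" | "center" | "right" | "justified" }
  | { type: "bulletList"; items: string[] };

// Converts notes into content items: "# " starts a heading, "- " a bullet,
// any other non-empty line becomes a paragraph.
function notesToDocxContent(notes: string): DocxContentItem[] {
  const result: DocxContentItem[] = [];
  let bullets: string[] = [];

  // Emit the pending bullets as one bulletList item.
  const flush = (): void => {
    if (bullets.length > 0) {
      result.push({ type: "bulletList", items: bullets });
      bullets = [];
    }
  };

  for (const raw of notes.split("\n")) {
    const line = raw.trim();
    if (line === "") {
      flush();
    } else if (line.startsWith("- ")) {
      bullets.push(line.slice(2));
    } else {
      flush();
      if (line.startsWith("# ")) {
        result.push({ type: "heading", content: line.slice(2), level: 1 });
      } else {
        result.push({ type: "paragraph", content: line });
      }
    }
  }
  flush();
  return result;
}

const docxContent = notesToDocxContent("# Agenda\n- Budget\n- Hiring\n\nMeeting starts at 10:00.");
console.log(JSON.stringify(docxContent, null, 2));
```

The returned array can be used as the `content` argument of create-docx alongside a `title`.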
create-pptx
Create a PowerPoint presentation.
Parameters:
- title (required): Presentation title
- author (optional): Presentation author
- slides (required): Array of slide objects:
  - title (optional): Slide title
  - subtitle (optional): Slide subtitle
  - layout: "title", "titleAndContent", "blank", or "sectionHeader"
  - elements: Array of elements:
    - { type: "textBox", text: string, x?, y?, w?, h?, fontSize?, bold?, color?, align? }
    - { type: "image", base64: string, x?, y?, w?, h? }
    - { type: "shape", shapeType: "rect"|"ellipse"|"triangle"|"line"|"arrow", ... }
    - { type: "table", headers: string[], rows: string[][], x?, y?, w? }
    - { type: "chart", chartType: "bar"|"line"|"pie"|"doughnut", data: [...], ... }
  - backgroundColor (optional): Slide background color
  - notes (optional): Speaker notes
- outputPath (optional): File path to save the PPTX
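To show how the slide structure nests, here is a minimal sketch that turns an outline into create-pptx arguments. `outlineToPptxArgs` and the type names are hypothetical. The sketch uses only textBox elements with their `text` field, because the types and units of the positioning fields, and the shape and chart options, are not spelled out above.

```typescript
// Hypothetical types covering a subset of the create-pptx parameters.
type SlideElement =
  | { type: "textBox"; text: string }
  | { type: "table"; headers: string[]; rows: string[][] };

interface Slide {
  title?: string;
  subtitle?: string;
  layout: "title" | "titleAndContent" | "blank" | "sectionHeader";
  elements: SlideElement[];
  notes?: string;
}

interface CreatePptxArgs {
  title: string;
  author?: string;
  slides: Slide[];
  outputPath?: string;
}

interface OutlineSection {
  heading: string;
  points: string[];
}

// One title slide, then one titleAndContent slide per outline section.
function outlineToPptxArgs(title: string, sections: OutlineSection[]): CreatePptxArgs {
  const slides: Slide[] = [{ title, layout: "title", elements: [] }];
  for (const section of sections) {
    slides.push({
      title: section.heading,
      layout: "titleAndContent",
      elements: section.points.map((text): SlideElement => ({ type: "textBox", text })),
      notes: `Talking points: ${section.points.length}`,
    });
  }
  return { title, slides };
}

const deck = outlineToPptxArgs("Project Kickoff", [
  { heading: "Goals", points: ["Ship the beta", "Collect feedback"] },
  { heading: "Timeline", points: ["Design", "Build", "Launch"] },
]);
console.log(deck.slides.length); // 3
```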
read-pdf
Extract text content from a PDF file.
Parameters:
- filePath (optional): Path to the PDF file
- base64Content (optional): Base64-encoded PDF content
- prompt (optional): Instruction for AI analysis (requires GOOGLE_API_KEY)
Returns: { text, metadata: { pageCount, info, version }, characterCount, wordCount, aiAnalysis? }
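Both input parameters are listed as optional, so a caller may want to check the arguments before invoking the tool. The sketch below assumes that at least one of `filePath` and `base64Content` must be present, which this README does not state explicitly. `checkReadArgs` and the type names are hypothetical.

```typescript
// Hypothetical pre-flight check for the read-pdf arguments.
type Env = Record<string, string | undefined>;

interface ReadArgs {
  filePath?: string;
  base64Content?: string;
  prompt?: string;
}

function checkReadArgs(args: ReadArgs, env: Env): string[] {
  const problems: string[] = [];
  // Assumption: the tool needs one input source even though both are optional.
  if (!args.filePath && !args.base64Content) {
    problems.push("provide filePath or base64Content");
  }
  // AI analysis is documented as requiring GOOGLE_API_KEY; GEMINI_API_KEY is its alias.
  const hasKey = Boolean(env["GOOGLE_API_KEY"] || env["GEMINI_API_KEY"]);
  if (args.prompt && !hasKey) {
    problems.push("prompt requires GOOGLE_API_KEY (or GEMINI_API_KEY)");
  }
  return problems;
}

const missingEverything = checkReadArgs({ prompt: "Summarize this file" }, {});
const validArgs = checkReadArgs({ filePath: "/path/to/document.pdf" }, {});
console.log(missingEverything.length, validArgs.length); // 2 0
```

The same check would apply to read-docx and read-pptx, which take the same input parameters.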
read-docx
Extract text content from a Word document.
Parameters:
- filePath (optional): Path to the DOCX file
- base64Content (optional): Base64-encoded DOCX content
- outputFormat (optional): "text", "html", or "both"
- prompt (optional): Instruction for AI analysis (requires GOOGLE_API_KEY)
Returns: { text?, html?, characterCount, wordCount, aiAnalysis? }
read-pptx
Extract text content from a PowerPoint presentation.
Parameters:
- filePath (optional): Path to the PPTX file
- base64Content (optional): Base64-encoded PPTX content
- prompt (optional): Instruction for AI analysis (requires GOOGLE_API_KEY)
Returns: { text, slideCount, slides: [{slide, text}], aiAnalysis? }
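The per-slide array in the result makes it straightforward to locate content. Here is a minimal sketch that finds the slides mentioning a keyword. It assumes `slide` is the slide number and `aiAnalysis` is a string, neither of which is stated above. `ReadPptxResult` and `slidesMentioning` are hypothetical names, and `sample` is hand-written data, not real tool output.

```typescript
// Hypothetical type for the documented read-pptx return shape.
interface ReadPptxResult {
  text: string;
  slideCount: number;
  slides: { slide: number; text: string }[]; // assumption: `slide` is the slide number
  aiAnalysis?: string; // assumption: returned as a string
}

// Returns the numbers of all slides whose text contains the keyword (case-insensitive).
function slidesMentioning(result: ReadPptxResult, keyword: string): number[] {
  const needle = keyword.toLowerCase();
  return result.slides
    .filter((entry) => entry.text.toLowerCase().includes(needle))
    .map((entry) => entry.slide);
}

// Hand-written sample for illustration only.
const sample: ReadPptxResult = {
  text: "Roadmap\nQ1: launch beta\nQ2: pricing review",
  slideCount: 3,
  slides: [
    { slide: 1, text: "Roadmap" },
    { slide: 2, text: "Q1: launch beta" },
    { slide: 3, text: "Q2: pricing review" },
  ],
};

console.log(slidesMentioning(sample, "Pricing")); // [ 3 ]
```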
Running as HTTP Server
Start the HTTP server for web-based MCP clients:
npm run start:http
# or
npx documents-mcp-http
The HTTP server exposes:
- GET /sse - SSE endpoint for MCP clients
- POST /messages?sessionId=<id> - Message endpoint for SSE sessions
- GET /health - Health check endpoint
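For clients that assemble these URLs by hand, a small helper keeps the paths in one place. This is a sketch: `endpointUrls` is a hypothetical name, the base URL `http://localhost:3000` is an assumption (set it to wherever your server listens), and the session id shown is a placeholder for the value issued by the server.

```typescript
// Hypothetical helper that builds the three documented endpoint URLs.
interface EndpointUrls {
  sse: string;
  health: string;
  messages: string | undefined;
}

function endpointUrls(baseUrl: string, sessionId?: string): EndpointUrls {
  // Drop trailing slashes so paths are not doubled.
  const base = baseUrl.replace(/\/+$/, "");
  return {
    sse: `${base}/sse`,
    health: `${base}/health`,
    // The message endpoint only makes sense once a session id is known.
    messages:
      sessionId === undefined
        ? undefined
        : `${base}/messages?sessionId=${encodeURIComponent(sessionId)}`,
  };
}

const urls = endpointUrls("http://localhost:3000/", "abc 123");
console.log(urls.sse);      // http://localhost:3000/sse
console.log(urls.messages); // http://localhost:3000/messages?sessionId=abc%20123
```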
Client SDK
Use the client SDK to connect to the documents-mcp server programmatically:
import { createClient } from "documents-mcp/client";

// Connect via STDIO (for CLI usage)
const client = createClient({
  transport: "stdio",
  command: "documents-mcp",
});
await client.connect();

// List available tools
const tools = await client.listTools();
console.log(tools);

// Create a PDF
const result = await client.createPdf({
  title: "My Document",
  content: [
    { type: "heading", content: "Hello World", level: 1 },
    { type: "text", content: "This is a sample document." },
  ],
});
console.log(result.content); // { success: true, filePath: "...", pageCount: 1 }

await client.disconnect();
SSE Transport
For HTTP/SSE connections:
const client = createClient({
  transport: "sse",
  url: "http://localhost:3000/sse",
});
await client.connect();
// ... use client methods
await client.disconnect();
Available Client Methods
| Method | Description |
|--------|-------------|
| connect() | Connect to the MCP server |
| disconnect() | Disconnect from the server |
| listTools() | List available tools |
| callTool(name, args) | Call any tool by name |
| createPdf(options) | Create a PDF document |
| createDocx(options) | Create a Word document |
| createPptx(options) | Create a PowerPoint presentation |
| readPdf(options) | Extract text from a PDF |
| readDocx(options) | Extract text from a Word document |
| readPptx(options) | Extract text from a PowerPoint |
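Because every session is bracketed by `connect()` and `disconnect()`, it can help to wrap the pair so the connection is released even when a call throws. The sketch below relies only on those two methods from the table. `withClient` and `Connectable` are hypothetical names, and `fakeClient` is a stand-in used so the example runs without a server.

```typescript
// Minimal interface covering the two lifecycle methods from the table above.
interface Connectable {
  connect(): Promise<unknown>;
  disconnect(): Promise<unknown>;
}

// Hypothetical helper: connects, runs the work, and always disconnects afterwards.
async function withClient<C extends Connectable, T>(
  client: C,
  work: (client: C) => Promise<T>,
): Promise<T> {
  await client.connect();
  try {
    return await work(client);
  } finally {
    await client.disconnect();
  }
}

// Stand-in client for demonstration; a real one would come from createClient().
const calls: string[] = [];
const fakeClient = {
  connect: async (): Promise<void> => {
    calls.push("connect");
  },
  disconnect: async (): Promise<void> => {
    calls.push("disconnect");
  },
  listTools: async (): Promise<string[]> => ["create-pdf", "read-pdf"],
};

const demo = withClient(fakeClient, (c) => c.listTools()).then((toolNames) => {
  console.log(toolNames, calls);
  return toolNames;
});
```

With a real client, the callback would call methods such as `createPdf(options)` or `callTool(name, args)` instead of the fake `listTools()`.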
Programmatic Usage (Direct)
You can also use the document tools directly without the MCP server:
import { createPdf, readPdf } from "documents-mcp";

// Create a PDF
const result = await createPdf.handler(
  createPdf.schema.parse({
    title: "My Document",
    content: [{ type: "text", content: "Hello World" }],
  })
);

// Read a PDF
const extracted = await readPdf.handler(
  readPdf.schema.parse({
    filePath: "/path/to/document.pdf",
  })
);
Development
# Clone the repository
git clone https://github.com/HarjjotSinghh/documents-mcp.git
cd documents-mcp
# Install dependencies
npm install
# Run in development mode
npm run dev
# Build for production
npm run build
License
MIT
