uns-mcp-server
v2.0.2
Pure JavaScript MCP server for Unstructured.io - No Python required!
Unstructured MCP Server
A pure JavaScript MCP server for document processing with the Unstructured.io API. Process PDFs, Word documents, HTML, images, and more directly from Claude Desktop and other AI clients, with no Python required!
Features
- 📄 Multi-format Support: PDF, DOCX, HTML, images (with OCR), and more
- 🚀 NPX Executable: No local installation required
- 🤖 Claude Desktop Integration: Works seamlessly with Claude
- ⚡ Pure JavaScript: No Python dependencies
- 🔍 OCR Support: Extract text from scanned documents
- 📊 Table Extraction: Extract and convert tables
- 📝 Multiple Output Formats: JSON, text, markdown
Quick Start
1. Get API Key
Sign up at https://unstructuredapp.io to get your API key.
2. Run with NPX (No Installation)
# Run directly with NPX
UNSTRUCTURED_API_KEY=your_key_here npx uns-mcp-server
3. Add to Claude Desktop
Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
  "mcpServers": {
    "uns-mcp": {
      "command": "npx",
      "args": ["uns-mcp-server"],
      "env": {
        "UNSTRUCTURED_API_KEY": "your_key_here"
      }
    }
  }
}
Installation Options
Global Install
npm install -g uns-mcp-server
uns-mcp-server
Local Project
npm install uns-mcp-server
npx uns-mcp-server
Available Tools
The MCP server provides the following tools:
Document Processing
- process_document: Process any document with OCR and formatting preservation
- extract_text: Extract plain text from documents
- extract_tables: Extract tables in JSON, CSV, or HTML format
Connectors
- list_sources: List configured document sources
- create_source_connector: Create input source (S3, Azure, local, etc.)
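For clients other than Claude Desktop, it can help to see what a call to one of these tools looks like on the wire. The sketch below builds a JSON-RPC 2.0 `tools/call` request in the shape defined by the MCP specification. The helper name `buildToolCall` is hypothetical, the argument names are taken from the usage examples in this README, and the assumption that this server talks over the stdio transport is inferred from the `command`/`args` style configuration rather than stated by the package.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request in the shape MCP defines.
// buildToolCall is a hypothetical helper, not part of uns-mcp-server.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Argument names follow the extract_text usage example in this README.
const request = buildToolCall(1, "extract_text", {
  file_path: "/path/to/document.pdf",
  include_metadata: true,
});

// MCP's stdio transport sends one JSON message per line.
console.log(JSON.stringify(request));
```

In practice an MCP client library handles this framing for you; the sketch is only meant to show which fields carry the tool name and its arguments.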
Usage Examples
Process a PDF
// In Claude Desktop, use the tool directly
await process_document({
  file_path: "/path/to/document.pdf",
  strategy: "hi_res",
  ocr_enabled: true,
  output_format: "json"
});
Extract Text
await extract_text({
  file_path: "/path/to/document.pdf",
  include_metadata: true
});
Extract Tables
await extract_tables({
  file_path: "/path/to/spreadsheet.xlsx",
  format: "csv"
});
Supported Formats
- Documents: PDF, DOCX, DOC, ODT, RTF, TXT
- Images: PNG, JPG, JPEG, TIFF, BMP (with OCR)
- Web: HTML, XML
- Spreadsheets: XLSX, XLS, CSV
- Presentations: PPTX, PPT
- Email: EML, MSG
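If you want to check a file before sending it to the server, the list above can be turned into a simple extension lookup. This is a minimal sketch: `SUPPORTED_FORMATS` and `formatCategory` are hypothetical names, and the check looks only at the file extension, not the file contents.

```javascript
const path = require("path");

// Extensions copied from the Supported Formats list, grouped by category.
const SUPPORTED_FORMATS = {
  Documents: ["pdf", "docx", "doc", "odt", "rtf", "txt"],
  Images: ["png", "jpg", "jpeg", "tiff", "bmp"],
  Web: ["html", "xml"],
  Spreadsheets: ["xlsx", "xls", "csv"],
  Presentations: ["pptx", "ppt"],
  Email: ["eml", "msg"],
};

// Return the category for a file path, or null when the extension is not listed.
function formatCategory(filePath) {
  const ext = path.extname(filePath).slice(1).toLowerCase();
  for (const [category, extensions] of Object.entries(SUPPORTED_FORMATS)) {
    if (extensions.includes(ext)) return category;
  }
  return null;
}

console.log(formatCategory("/path/to/document.pdf")); // Documents
console.log(formatCategory("scan.TIFF")); // Images
console.log(formatCategory("archive.zip")); // null
```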
Processing Strategies
- auto: Automatically select the best strategy
- hi_res: High resolution processing with layout preservation
- ocr_only: Focus on OCR for scanned documents
- fast: Quick processing for simple documents
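A typo in the strategy name is easy to make, so a client may want to validate it before calling `process_document`. The sketch below assumes the four names listed above are the only accepted values; `buildProcessArgs` is a hypothetical helper, and defaulting to `auto` is a choice made for this example, not documented server behavior.

```javascript
// Strategy names taken from the Processing Strategies list.
const STRATEGIES = ["auto", "hi_res", "ocr_only", "fast"];

// Build arguments for process_document, rejecting unknown strategy names early.
// Defaulting to "auto" is an assumption of this sketch.
function buildProcessArgs(filePath, strategy = "auto") {
  if (!STRATEGIES.includes(strategy)) {
    throw new Error(
      `Unknown strategy "${strategy}". Expected one of: ${STRATEGIES.join(", ")}`
    );
  }
  return { file_path: filePath, strategy };
}

console.log(buildProcessArgs("/path/to/scan.pdf", "ocr_only"));
```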
Environment Variables
- UNSTRUCTURED_API_KEY: Your Unstructured.io API key (required)
- UNSTRUCTURED_API_URL: Custom API endpoint (optional)
- LOG_LEVEL: Logging level: ERROR, WARN, INFO, DEBUG (default: ERROR)
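The following sketch shows one way a wrapper script might read these variables, applying the documented default for `LOG_LEVEL` and failing early when the key is missing. `loadSettings` is a hypothetical function and does not reflect how the server itself parses its environment. Uppercasing the log level is an assumption, and the custom endpoint is left as `null` when unset because the default URL is not given in this README.

```javascript
const LOG_LEVELS = ["ERROR", "WARN", "INFO", "DEBUG"];

// Read settings from an environment-like object (defaults to process.env).
function loadSettings(env = process.env) {
  const apiKey = env.UNSTRUCTURED_API_KEY;
  if (!apiKey) {
    throw new Error("UNSTRUCTURED_API_KEY is required");
  }

  // Assumption: treat the level case-insensitively; the documented default is ERROR.
  const logLevel = (env.LOG_LEVEL || "ERROR").toUpperCase();
  if (!LOG_LEVELS.includes(logLevel)) {
    throw new Error(`LOG_LEVEL must be one of: ${LOG_LEVELS.join(", ")}`);
  }

  // The default endpoint is not documented here, so leave it unset.
  const apiUrl = env.UNSTRUCTURED_API_URL || null;

  return { apiKey, apiUrl, logLevel };
}

console.log(loadSettings({ UNSTRUCTURED_API_KEY: "your_key_here", LOG_LEVEL: "debug" }));
```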
Testing
Test your installation:
# Create a test file
echo "Hello World" > test.txt
# Process it
npx uns-mcp-server test.txt
Troubleshooting
API Key Issues
# Verify your API key is set
echo $UNSTRUCTURED_API_KEY
# Set it for current session
export UNSTRUCTURED_API_KEY=your_key_here
Connection Issues
- Ensure you have internet connectivity
- Verify the API key is valid
- Check if you're behind a corporate firewall
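Some of the checks above can be scripted. The sketch below inspects the environment for common problems: a missing key, a key still set to the `your_key_here` placeholder used in this README, stray whitespace, and a configured proxy that may point to a corporate network. `diagnose` is a hypothetical helper, and `HTTPS_PROXY` is the conventional proxy variable rather than something this package documents. It does not test connectivity or validate the key against the API.

```javascript
// Return a list of human-readable findings about the environment.
function diagnose(env = process.env) {
  const findings = [];
  const key = env.UNSTRUCTURED_API_KEY;

  if (!key) {
    findings.push("UNSTRUCTURED_API_KEY is not set");
  } else if (key === "your_key_here") {
    findings.push("UNSTRUCTURED_API_KEY is still the README placeholder");
  } else if (key !== key.trim()) {
    findings.push("UNSTRUCTURED_API_KEY has leading or trailing whitespace");
  }

  // A configured proxy often means requests pass through a corporate firewall.
  if (env.HTTPS_PROXY || env.https_proxy) {
    findings.push("A proxy is configured; check that it allows access to the API");
  }

  return findings;
}

for (const finding of diagnose()) {
  console.log(`- ${finding}`);
}
```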
Changelog
v2.0.2
- Updated GitHub repository to CG-Labs organization
- Documentation improvements
v2.0.1
- Fixed API endpoint to use correct domain
- Improved error handling
- Better connection stability
v2.0.0
- Complete rewrite in pure JavaScript
- Removed all Python dependencies
- Direct API integration
- Improved performance
v1.0.x
- Initial release with Python bridge
License
MIT License
Support
- Issues: GitHub Issues
- GitHub: github.com/CG-Labs/Unstructured-Document-Processor-MCP
- NPM: npmjs.com/package/uns-mcp-server
- Unstructured.io: Documentation
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Made with ❤️ for the Claude Desktop community
