@visionengine/image-recognize
v1.0.4
VisionEngine Image Recognition MCP Server - Analyze visual design elements and extract text from images using advanced AI vision models.
Features
- Visual Design Analysis - Analyze layout, colors, typography, composition and design principles
- Text & Data Extraction - Extract text, tables, charts and structured data from images (OCR)
- Flexible Prompts - Supply a custom prompt or fall back to the built-in default
- Multiple Image Formats - Support for JPG, PNG, GIF, WebP
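As a rough illustration of the format list above, a caller could check a file's extension before handing it to a tool. This is only a sketch: `mimeTypeFor` and `SUPPORTED_FORMATS` are hypothetical names, not part of the package, and the server itself may detect formats differently.

```javascript
// Hypothetical helper (not part of @visionengine/image-recognize):
// maps the extensions of the formats listed above to their MIME types.
const SUPPORTED_FORMATS = new Map([
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["png", "image/png"],
  ["gif", "image/gif"],
  ["webp", "image/webp"],
]);

// Returns the MIME type for a supported image path, or null otherwise.
function mimeTypeFor(imagePath) {
  const dot = imagePath.lastIndexOf(".");
  if (dot === -1) return null;
  const ext = imagePath.slice(dot + 1).toLowerCase();
  return SUPPORTED_FORMATS.get(ext) ?? null;
}
```

A `null` result can be used to reject a file early, before any API call is made.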
Installation
As MCP Server
Add to your MCP client configuration:
{
  "mcpServers": {
    "ve-image-recognize": {
      "type": "local",
      "command": "npx",
      "args": ["-y", "@visionengine/image-recognize@latest"],
      "transport": "stdio",
      "env": {
        "API_URL": "https://openrouter.ai",
        "API_KEY": "your_api_key_here",
        "WORKDIR": "./public"
      }
    }
  }
}
As NPM Package
npm install -g @visionengine/image-recognize
Configuration
Environment variables:
- API_URL - VisionEngine API endpoint (default: https://openrouter.ai)
- API_KEY - Your VisionEngine API key (required)
- WORKDIR - Base directory for relative image paths (default: ./)
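The defaults and the required key described above could be resolved along these lines. This is a minimal sketch under stated assumptions: `resolveConfig` and the shape of the returned object are hypothetical, and the server's actual startup logic is not documented here.

```javascript
// Hypothetical sketch: reads the three documented environment variables,
// applying the documented defaults. Names and return shape are assumptions.
function resolveConfig(env = process.env) {
  const apiKey = env.API_KEY;
  if (!apiKey) {
    // API_KEY is documented as required, so fail early without it.
    throw new Error("API_KEY is required");
  }
  return {
    apiUrl: env.API_URL || "https://openrouter.ai", // documented default
    apiKey,
    workdir: env.WORKDIR || "./", // documented default
  };
}
```

Passing the environment as a parameter keeps the function easy to exercise with a plain object, for example `resolveConfig({ API_KEY: "test" })`.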
Tools
visual
Analyze visual design elements in an image.
Parameters:
- imagePath (string, required) - Image file path (relative to WORKDIR or absolute)
- prompt (string, optional) - Custom analysis prompt
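One plausible reading of "relative to WORKDIR or absolute" is sketched below with Node's built-in `path` module. `resolveImagePath` is a hypothetical helper, and the server's real resolution rules may differ.

```javascript
const path = require("node:path");

// Hypothetical helper: absolute paths are used as-is, while relative
// paths are resolved against the WORKDIR base directory.
function resolveImagePath(imagePath, workdir = "./") {
  return path.isAbsolute(imagePath)
    ? imagePath
    : path.resolve(workdir, imagePath);
}
```

Under this reading, with WORKDIR set to `./public`, the path `./design.png` would refer to `public/design.png` under the directory the server was started from.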
Default Analysis:
- Layout and composition
- Color scheme and palette
- Typography and text styling
- Visual hierarchy
- Design style and characteristics
- Visual focal points
- Design principles (contrast, balance, alignment, etc.)
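The fallback from a custom prompt to the default analysis might look roughly like the sketch below. The wording of the real default prompt is not documented, so the string built here, along with the names `DEFAULT_VISUAL_ASPECTS` and `buildVisualPrompt`, is purely illustrative.

```javascript
// The aspects listed under "Default Analysis" above.
const DEFAULT_VISUAL_ASPECTS = [
  "Layout and composition",
  "Color scheme and palette",
  "Typography and text styling",
  "Visual hierarchy",
  "Design style and characteristics",
  "Visual focal points",
  "Design principles (contrast, balance, alignment, etc.)",
];

// Hypothetical: returns the custom prompt when one is given, otherwise
// builds an illustrative default prompt from the list above.
function buildVisualPrompt(customPrompt) {
  if (typeof customPrompt === "string" && customPrompt.trim() !== "") {
    return customPrompt;
  }
  const bullets = DEFAULT_VISUAL_ASPECTS.map((aspect) => `- ${aspect}`);
  return ["Analyze the visual design of this image, covering:", ...bullets].join("\n");
}
```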
Example:
// Using default prompt
await visual({
  imagePath: "./design.png"
});
// Using custom prompt
await visual({
  imagePath: "./design.png",
  prompt: "Analyze the color scheme and visual hierarchy of this web design"
});
text
Extract text content and data information from an image.
Parameters:
- imagePath (string, required) - Image file path (relative to WORKDIR or absolute)
- prompt (string, optional) - Custom extraction prompt
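In the Model Context Protocol, a client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request whose `arguments` carry the parameters above. The sketch below builds such a message for the text tool. `buildToolCall` is a hypothetical helper, and a real client also performs the MCP initialization handshake before calling tools.

```javascript
// Hypothetical helper: builds a JSON-RPC 2.0 "tools/call" request
// in the shape defined by the Model Context Protocol.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// The optional prompt is simply omitted to request the default extraction.
const request = buildToolCall(1, "text", { imagePath: "./document.jpg" });

// Over the stdio transport, each message is sent as a single line of JSON.
const line = JSON.stringify(request) + "\n";
process.stdout.write(line);
```

Most MCP clients construct this message themselves, so the sketch is mainly useful for debugging or for writing a custom client.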
Default Extraction:
- All text content (OCR)
- Table data
- Chart data
- Numbers and statistics
- Labels, titles, captions
- Structured data
Example:
// Using default prompt
await text({
  imagePath: "./document.jpg"
});
// Using custom prompt
await text({
  imagePath: "./document.jpg",
  prompt: "Extract all table data from the image and convert it to Markdown format"
});
Usage Examples
MCP Client
Once configured as an MCP server, the tools are available through your MCP client:
> Use visual tool on screenshot.png
> Use text tool to extract data from invoice.jpg
Direct Usage
# Install globally
npm install -g @visionengine/image-recognize
# Set environment variables
export API_KEY="your_api_key"
export WORKDIR="./images"
# Run the server
ve-image-recognize
Development
Build
pnpm build
Local Testing
# Build first
pnpm build
# Run locally
node dist/index.js
Support
For issues and questions:
- Email: [email protected]
- Website: https://visionengine-tech.com
