@mcpcn/image-understanding
v1.0.4
Analyze and interpret images with the Zhipu AI GLM vision models
Image Understanding MCP
An MCP server that uses the Zhipu AI GLM vision models to analyze and describe images.
Features
- Supports multiple GLM vision models (glm-4v-plus, glm-4v, glm-4v-flash)
- Flexible generation controls (temperature, max tokens, etc.)
- Robust error handling with clear messaging
- TypeScript-based implementation with strong typing
Environment Variables
- ZHIPU_API_KEY or GLM_API_KEY: Required, Zhipu AI API key
- GLM_VISION_MODEL: Optional, override the default model (defaults to glm-4v-plus)
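The variables above could be resolved at startup roughly as follows. This is a minimal sketch, not the package's actual code: `resolveConfig`, `ServerConfig`, and the error text are hypothetical names, and the precedence of ZHIPU_API_KEY over GLM_API_KEY when both are set is an assumption. The environment is passed in as a plain object (for example `process.env` under Node.js) so the snippet stays self-contained.

```typescript
// Hypothetical helper: names and error text are illustrative, not from the package source.
type Env = Record<string, string | undefined>;

interface ServerConfig {
  apiKey: string;
  model: string;
}

// Default model as stated in the README.
const DEFAULT_MODEL = "glm-4v-plus";

function resolveConfig(env: Env): ServerConfig {
  // Assumption: ZHIPU_API_KEY wins when both key variables are set.
  const apiKey = env.ZHIPU_API_KEY || env.GLM_API_KEY;
  if (!apiKey) {
    throw new Error("Set ZHIPU_API_KEY or GLM_API_KEY before starting the server.");
  }
  // GLM_VISION_MODEL is optional and falls back to the documented default.
  const model = env.GLM_VISION_MODEL || DEFAULT_MODEL;
  return { apiKey, model };
}

const example = resolveConfig({ GLM_API_KEY: "your-zhipu-key" });
console.log(example.model); // glm-4v-plus
```

Failing fast on a missing key surfaces the problem when the server starts rather than on the first tool call.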
Install & Build
npm install
npm run build
Run
# Run the compiled entry
npm start
# Or invoke the CLI wrapper
image-understanding-mcp
MCP Client Configuration
Example entry for Claude Desktop:
{
  "mcpServers": {
    "image-understanding": {
      "command": "node",
      "args": ["/absolute/path/to/this/project/dist/index.js"],
      "env": {
        "ZHIPU_API_KEY": "your-zhipu-key"
      }
    }
  }
}
Exposed Tool
image_understanding
Analyze an image via the GLM vision model suite.
Parameters:
- imageUrl (required): Image URL
- prompt (optional): Instruction for the analysis, defaults to “Describe this image in detail.”
- model (optional): Model id, default glm-4v-plus
- temperature (optional): 0-1, default 0.7
- maxTokens (optional): 1-4096, default 1024
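As an illustration of the parameter list above, the defaults and ranges could be applied like this. It is a sketch under assumptions: `ImageUnderstandingArgs` and `applyDefaults` are hypothetical names, and rejecting out-of-range values with an error (rather than clamping them) is a guess at the behavior, not something the README states.

```typescript
// Hypothetical input shape mirroring the documented parameters.
interface ImageUnderstandingArgs {
  imageUrl: string;
  prompt?: string;
  model?: string;
  temperature?: number;
  maxTokens?: number;
}

function applyDefaults(args: ImageUnderstandingArgs): Required<ImageUnderstandingArgs> {
  // Defaults are the ones listed in the README.
  const resolved = {
    imageUrl: args.imageUrl,
    prompt: args.prompt ?? "Describe this image in detail.",
    model: args.model ?? "glm-4v-plus",
    temperature: args.temperature ?? 0.7,
    maxTokens: args.maxTokens ?? 1024,
  };
  if (!resolved.imageUrl) {
    throw new Error("imageUrl is required");
  }
  // Assumption: out-of-range values are rejected rather than clamped.
  if (resolved.temperature < 0 || resolved.temperature > 1) {
    throw new RangeError("temperature must be between 0 and 1");
  }
  if (!Number.isInteger(resolved.maxTokens) || resolved.maxTokens < 1 || resolved.maxTokens > 4096) {
    throw new RangeError("maxTokens must be an integer between 1 and 4096");
  }
  return resolved;
}

const resolvedArgs = applyDefaults({ imageUrl: "https://example.com/photo.jpg" });
console.log(resolvedArgs.model, resolvedArgs.maxTokens); // glm-4v-plus 1024
```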
Supported Models:
- glm-4v-plus: Highest fidelity, best for complex scenes (max 4096 tokens)
- glm-4v: Balanced performance/cost (max 2048 tokens)
- glm-4v-flash: Fastest inference for simple analysis (max 1024 tokens)
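Because each model has its own token ceiling, a caller may want to cap maxTokens per model before sending a request. The sketch below encodes the limits listed above. `MODEL_MAX_TOKENS`, `isVisionModel`, and `capMaxTokens` are hypothetical names, and whether the server itself caps or rejects an oversized value is not documented, so treat the capping behavior as an assumption.

```typescript
// Per-model limits as listed in the README.
const MODEL_MAX_TOKENS = {
  "glm-4v-plus": 4096,
  "glm-4v": 2048,
  "glm-4v-flash": 1024,
} as const;

type VisionModel = keyof typeof MODEL_MAX_TOKENS;

// Type guard: narrows an arbitrary string to one of the supported model ids.
function isVisionModel(id: string): id is VisionModel {
  return Object.prototype.hasOwnProperty.call(MODEL_MAX_TOKENS, id);
}

// Assumption: oversized requests are capped to the model's limit, not rejected.
function capMaxTokens(model: string, requested: number): number {
  if (!isVisionModel(model)) {
    throw new Error(`Unsupported model: ${model}`);
  }
  return Math.min(requested, MODEL_MAX_TOKENS[model]);
}

console.log(capMaxTokens("glm-4v-flash", 4096)); // 1024
console.log(capMaxTokens("glm-4v", 1024)); // 1024
```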
Project Layout
src/
├── index.ts # MCP server entry
├── config.ts # Model options
├── types.ts # Shared types
└── tool.ts # Tool implementation
Compatibility
- Node.js >= 18
- @modelcontextprotocol/sdk v1
