@mako10k/mcp-image
v1.4.0
MCP Server for AI Image Generation API via JOBAPI
AI Image API MCP Server
This MCP server proxies every tool invocation to an AI image generation API deployed on Modal.com. No local inference logic is kept; all requests are forwarded directly to the Modal endpoints.
For most workflows, prefer the optimize_and_generate_image tool, which optimizes a prompt and generates an image in one call. Switch to generate_image only when you need to override individual parameters manually.
Features
- Image generation: Create PNG images via the Modal text-to-image API.
- Model catalog: List available models and inspect their metadata.
- Prompt optimization: Call the Modal Job API to refine prompts.
- One-shot optimize + generate: Chain optimization and generation in a single request.
- Modal API pass-through: Forward MCP tool inputs directly to the Modal API payload.
- Resource management: Cache generated images locally and expose them as MCP resources.
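I/O and logging policy (JSON-only)
To comply strictly with the JSON-RPC framing of the Model Context Protocol (MCP), this server applies the following policy to all input and output.
- stdout: Reserved for MCP JSON-RPC messages. No human-readable logs or messages are ever written here.
- stderr: Receives all logs (startup messages, backend call status, and so on).
- Tool responses: JSON only. When needed, binaries such as images are sent alongside as media content (image/png, etc.); no text/plain summary is returned.
Minimal example (conceptual skeleton of a generate_image response):
{
  "content": [
    // Image body (MCP binary/image content).
    // Base64 is not included by default (it can be enabled with the environment variable described below).
    { "type": "image", "mime_type": "image/png", "data": "<base64>" },
    // Accompanying metadata (application/json)
    { "type": "application/json", "json": {
      "resource_uri": "resource://ai-image-api/image/<uuid>",
      "mime_type": "image/png",
      "used_params": { "model": "...", "width": 768, "height": 512, "steps": 20, "guidance_scale": 7.5 },
      "metadata": { "width": 768, "height": 512, "size_bytes": 658542 }
    }}
  ]
}
This policy keeps client-side parsing stable and prevents the JSON-RPC corruption that occurs when stray output is mixed into stdout.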
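As a rough client-side illustration of the skeleton above, the sketch below pulls the resource URI out of a tool response. The `ToolContent` types and the `findResourceUri` helper are hypothetical names introduced here, modelled only on the conceptual skeleton; the real response schema may differ.

```typescript
// Hypothetical types mirroring the conceptual skeleton; not the server's
// published schema.
interface ImageContent {
  type: "image";
  mime_type: string;
  data?: string; // Base64 payload, absent unless explicitly enabled
}

interface JsonContent {
  type: "application/json";
  json: { resource_uri: string; mime_type: string; [key: string]: unknown };
}

type ToolContent = ImageContent | JsonContent;

// Return the resource URI from the first JSON metadata entry, if any.
function findResourceUri(content: ToolContent[]): string | undefined {
  for (const item of content) {
    if (item.type === "application/json") {
      return item.json.resource_uri;
    }
  }
  return undefined;
}

// Sample response shaped like the skeleton, with Base64 omitted (the default).
const sampleContent: ToolContent[] = [
  { type: "image", mime_type: "image/png" },
  {
    type: "application/json",
    json: {
      resource_uri: "resource://ai-image-api/image/<uuid>",
      mime_type: "image/png",
    },
  },
];

const sampleUri = findResourceUri(sampleContent);
```

Reading the URI from the JSON entry rather than from the image entry keeps the client working whether or not Base64 output is enabled.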
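Controlling Base64 output (disabled by default)
Large Base64 image payloads eat into the LLM's token budget, so tool responses omit them by default. Images are always saved to the local cache and can be referenced via resource_uri.
- Setting the environment variable AI_IMAGE_MCP_INCLUDE_BASE64 to true/1/yes/on includes the image's Base64 data in tool responses.
- The default is unset (= false). The JSON then contains only resource_uri, so fetch the image body with readResource when you need it (image_token is for internal processing only).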
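The flag handling described above might look roughly like the following sketch. The function name `includeBase64` is hypothetical, and the case-insensitive, whitespace-tolerant matching is an assumption; the server's actual parsing rules are not documented here.

```typescript
// Values listed in this README as enabling Base64 output.
const TRUTHY_VALUES = new Set(["true", "1", "yes", "on"]);

// Assumption: matching ignores case and surrounding whitespace.
// The environment is passed in explicitly so the function stays pure.
function includeBase64(env: Record<string, string | undefined>): boolean {
  const raw = env["AI_IMAGE_MCP_INCLUDE_BASE64"];
  return raw !== undefined && TRUTHY_VALUES.has(raw.trim().toLowerCase());
}
```

In a Node.js process this would be called as `includeBase64(process.env)`; an unset variable yields `false`, matching the documented default.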
Example: start the server with Base64 enabled
export AI_IMAGE_MCP_INCLUDE_BASE64=true
npx --yes @mako10k/mcp-image
Setup
Install dependencies:
npm install
Build the TypeScript sources:
npm run build
Install the CLI and avoid bin name conflicts:
# Remove any legacy/unscoped package that publishes the same bin name.
npm uninstall -g mcp-image 2>/dev/null || true
npm uninstall mcp-image 2>/dev/null || true
# Install the scoped CLI locally (recommended) or globally if desired.
npm install --save-dev @mako10k/mcp-image
# or: npm install -g @mako10k/mcp-image
npx and npm exec resolve binaries by name. A previously installed unscoped package called mcp-image can shadow this CLI and cause silent exits. Ensure only @mako10k/mcp-image remains installed before running commands such as npx @mako10k/mcp-image. See the Model Context Protocol guidance on connecting local servers for additional background (Connect to local MCP servers, modelcontextprotocol.io).
Configure your JOB API server URL by exporting environment variables:
export MODAL_JOB_API_URL="https://your-deployment--modal-image-jobapi-serve.modal.run"
export JOBAPI_API_KEY="your-api-key-here"
Register the server with your MCP client (for example, VS Code):
# Copy the example MCP configuration
cp .vscode/mcp.json.example .vscode/mcp.json
# Edit .vscode/mcp.json with your actual paths and credentials
nano .vscode/mcp.json
Example configuration:
{
  "servers": {
    "ai-image-api-mcp-server": {
      "type": "stdio",
      "command": "node",
      "args": ["/path/to/mcp-image/dist/index.js"],
      "env": {
        "MODAL_JOB_API_URL": "https://your-deployment--modal-image-jobapi-serve.modal.run",
        "JOBAPI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Connecting to the JOB API server
The server connects to a JOB API server for all operations including image generation, model information, and prompt optimization. All requests are proxied through the JOB API server.
Required environment variables:
- MODAL_JOB_API_URL (or JOB_API_SERVER_URL): JOB API server endpoint
- JOBAPI_API_KEY: API key for authentication
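A minimal sketch of how a launcher script could check these variables before starting the server is shown below. `resolveJobApiConfig` and `JobApiConfig` are hypothetical names, and the precedence between the two URL variables (MODAL_JOB_API_URL first) is an assumption rather than documented behaviour.

```typescript
interface JobApiConfig {
  baseUrl: string;
  apiKey: string;
}

type EnvRecord = Record<string, string | undefined>;

// Assumption: MODAL_JOB_API_URL wins when both URL variables are set, and an
// empty string counts as unset.
function resolveJobApiConfig(env: EnvRecord): JobApiConfig {
  const baseUrl = env["MODAL_JOB_API_URL"] || env["JOB_API_SERVER_URL"];
  const apiKey = env["JOBAPI_API_KEY"];
  if (!baseUrl) {
    throw new Error("Set MODAL_JOB_API_URL (or JOB_API_SERVER_URL)");
  }
  if (!apiKey) {
    throw new Error("Set JOBAPI_API_KEY");
  }
  return { baseUrl, apiKey };
}
```

Failing fast with a named variable in the error message tends to be easier to debug than a rejected request later on.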
Example:
export MODAL_JOB_API_URL=https://your-deployment--modal-image-jobapi-serve.modal.run
export JOBAPI_API_KEY=your-api-key-here
Running the CLI quickly with npx / npm exec
Once the conflicting package cleanup is complete, you can launch the server via:
npx --yes @mako10k/mcp-image
# or
npm exec --yes @mako10k/mcp-image
If you must keep the unscoped mcp-image package for another project, prefer an explicit path:
./node_modules/.bin/mcp-image
This ensures the scoped CLI is selected even when other versions are present.
The JOB API server provides the following endpoints:
- POST /text-to-image: Image generation (proxies Modal text-to-image)
- GET /model-configs: Model list (proxies Modal get-model-configs)
- GET /model-configs/{model_name}: Model detail
- POST /optimize_params_v2: Prompt optimization
- POST /optimize_and_generate_image: Combined optimize + generate
Available tools
optimize_and_generate_image (recommended)
Primary tool that optimizes a prompt and generates an image in a single call. It automatically feeds the optimize_prompt result into the generator and returns both the image and the optimization details as JSON.
Key parameters:
- query (required): Natural-language description to optimize.
- target_model: Model name to prefer during optimization.
- generation_overrides: Object containing overrides applied during generation (same keys as generate_image).
Because the recommended settings are applied automatically, start with this tool. Switch to generate_image only when you need fine-grained control.
generate_image
Generate an image directly from a natural-language prompt. Use this when you need precise parameter control or want to bypass the optimizer. Every field you pass from MCP is forwarded to the Modal text-to-image API as-is.
Key parameters:
- prompt (required): Prompt used for generation.
- model: Model name to use. Defaults to dreamshaper8.
- negative_prompt: Elements to exclude.
- guidance_scale: Classifier-Free Guidance scale.
- steps: Number of diffusion steps (integer).
- width / height: Output size (multiples of 64, between 256 and 2048).
- seed: Random seed (integer).
- scheduler: Scheduler name.
Any additional fields you supply are forwarded without extra validation.
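Because inputs are passed through to the backend, a client may want to check the size constraint (multiples of 64, between 256 and 2048) before calling generate_image. The helpers below are a hypothetical client-side sketch; `isValidDimension` and `snapDimension` are not part of this package.

```typescript
const MIN_SIZE = 256;
const MAX_SIZE = 2048;
const SIZE_STEP = 64;

// True when the value is an integer multiple of 64 within 256..2048.
function isValidDimension(value: number): boolean {
  return (
    Number.isInteger(value) &&
    value >= MIN_SIZE &&
    value <= MAX_SIZE &&
    value % SIZE_STEP === 0
  );
}

// Round to the nearest multiple of 64, then clamp into the allowed range.
function snapDimension(value: number): number {
  const snapped = Math.round(value / SIZE_STEP) * SIZE_STEP;
  return Math.min(MAX_SIZE, Math.max(MIN_SIZE, snapped));
}
```

For instance, a requested width of 700 is not valid as-is, and `snapDimension(700)` rounds it to 704.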
get_available_models
Return the list of supported image generation models.
get_model_detail
Retrieve detailed information for a specific model.
Parameters:
- model_name (required): Name of the model to inspect.
optimize_prompt
Optimize a prompt for image generation and surface recommended parameters. Normally you do not need to call this directly because optimize_and_generate_image uses it internally, but it is available when you only need the optimization output.
Parameters:
- query (required): Prompt or scene description to optimize.
- target_model: Optional model to target.
draw_image
Generate an image programmatically from drawing commands. This tool uses the HTML5 Canvas API to render shapes, text, gradients, and composed images without AI inference. Commands are executed sequentially like PostScript or SVG paths.
Parameters:
- commands (required): Array of drawing command objects with type and parameters.
- width: Canvas width in pixels (1–4096, default 512).
- height: Canvas height in pixels (1–4096, default 512).
Supported command types:
- Shapes: line, curve, rect, circle, ellipse, polygon
- Styling: fill (solid color or gradient), stroke
- Text: text (with font, alignment, baseline)
- Composition: image (load and transform external images)
- Transforms: translate, rotate, scale, setTransform, resetTransform
- Path operations: beginPath, closePath, clearRect
- State: save, restore
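Since commands run sequentially, a command list can also be generated in code rather than written by hand. The sketch below builds vertical stripes from fill and rect commands. The `DrawCommand` type and `verticalStripes` helper are hypothetical and cover only fields that appear in this README's examples; it also assumes, as those examples suggest, that a fill command sets the colour used by the shape that follows it.

```typescript
// Hypothetical types covering only two of the supported commands.
type DrawCommand =
  | { type: "fill"; color: string }
  | { type: "rect"; x: number; y: number; width: number; height: number };

// Build equal-width vertical stripes: each stripe is a `fill` that sets the
// colour followed by a `rect` that is drawn with it (assumed semantics).
function verticalStripes(
  width: number,
  height: number,
  colors: string[],
): DrawCommand[] {
  const commands: DrawCommand[] = [];
  const stripeWidth = width / colors.length;
  colors.forEach((color, i) => {
    commands.push({ type: "fill", color });
    commands.push({
      type: "rect",
      x: i * stripeWidth,
      y: 0,
      width: stripeWidth,
      height,
    });
  });
  return commands;
}

// Four 128-pixel stripes on a 512x512 canvas.
const stripeCommands = verticalStripes(512, 512, [
  "#ff6b6b",
  "#4dabf7",
  "#51cf66",
  "#f0f0f0",
]);
```

The resulting array would be passed as the commands argument of draw_image, together with matching width and height values.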
Example:
await mcp.callTool('draw_image', {
  commands: [
    { type: 'fill', color: '#f0f0f0' },
    { type: 'rect', x: 0, y: 0, width: 512, height: 512 },
    { type: 'fill', color: '#ff6b6b' },
    { type: 'circle', x: 256, y: 200, radius: 80 },
    { type: 'fill', gradient: { type: 'linear', x0: 156, y0: 300, x1: 356, y1: 450, stops: [
      { offset: 0, color: '#4dabf7' },
      { offset: 1, color: '#51cf66' }
    ]}},
    { type: 'rect', x: 156, y: 300, width: 200, height: 150 }
  ],
  width: 512,
  height: 512
});
search_images
Search the local cache of generated images. Modal does not provide a search endpoint, so this server filters its own metadata store.
Parameters:
- query: Keyword to match within prompts (substring match).
- model: Restrict results to images generated by a specific model.
- limit: Maximum number of results (1–20, default 5).
- before: Only include images generated before this ISO 8601 timestamp.
- after: Only include images generated after this ISO 8601 timestamp.
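To make the filter semantics concrete, here is a rough sketch of how such a local search could be implemented. The `ImageRecord` shape is hypothetical (the real metadata.json schema is not shown in this README), and several details are assumptions: case-sensitive substring matching, strict before/after comparisons, clamping an out-of-range limit rather than rejecting it, and returning results in stored order.

```typescript
// Hypothetical record shape for illustration only.
interface ImageRecord {
  prompt: string;
  model: string;
  createdAt: string; // ISO 8601 timestamp
}

interface SearchParams {
  query?: string;
  model?: string;
  limit?: number;
  before?: string;
  after?: string;
}

function searchImages(
  records: ImageRecord[],
  params: SearchParams,
): ImageRecord[] {
  // Clamp the limit into the documented 1..20 range; default to 5.
  const limit = Math.min(20, Math.max(1, Math.trunc(params.limit ?? 5)));
  const before =
    params.before !== undefined ? Date.parse(params.before) : undefined;
  const after =
    params.after !== undefined ? Date.parse(params.after) : undefined;

  return records
    .filter((record) => {
      const createdAt = Date.parse(record.createdAt);
      if (params.query !== undefined && !record.prompt.includes(params.query)) {
        return false;
      }
      if (params.model !== undefined && record.model !== params.model) {
        return false;
      }
      if (before !== undefined && !(createdAt < before)) return false;
      if (after !== undefined && !(createdAt > after)) return false;
      return true;
    })
    .slice(0, limit);
}
```

All supplied filters are combined with a logical AND, so adding a filter can only narrow the result set.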
Managing generated resources
- Generated PNG files are stored at ~/.cache/ai-image-api-mcp/images.
- Metadata accumulates at ~/.cache/ai-image-api-mcp/metadata.json and is exposed via the MCP resource API.
- Resource URI format: resource://ai-image-api/image/<uuid>.
- The generate_image response includes the resource_uri for the saved image.
- The search_images tool queries the local metadata cache.
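Clients that store or compare resource URIs may find a small build/parse helper useful. The sketch below follows the URI format listed above; the helper names are hypothetical, and treating the identifier as a canonical 8-4-4-4-12 hexadecimal UUID is an assumption.

```typescript
const RESOURCE_PREFIX = "resource://ai-image-api/image/";

// Build a resource URI from an image identifier.
function toResourceUri(id: string): string {
  return RESOURCE_PREFIX + id;
}

// Return the identifier, or undefined when the URI does not match the format.
function parseResourceUri(uri: string): string | undefined {
  if (!uri.startsWith(RESOURCE_PREFIX)) return undefined;
  const id = uri.slice(RESOURCE_PREFIX.length);
  // Assumption: identifiers are canonical 8-4-4-4-12 hex UUIDs.
  const uuidPattern = /^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$/i;
  return uuidPattern.test(id) ? id : undefined;
}
```

Rejecting anything that does not match the expected prefix and shape avoids passing arbitrary strings to readResource.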
Inspecting cached images
npm run single-test
After the script completes, the latest image appears in the resources/list response and can be previewed directly from an MCP client such as VS Code.
API endpoints
All API calls are proxied through the JOB API server:
- Image generation: POST /text-to-image
- Model list: GET /model-configs
- Model detail: GET /model-configs/{model_name}
- Prompt optimization: POST /optimize_params_v2
- Combined workflow: POST /optimize_and_generate_image
Configure the JOB API server URL via the MODAL_JOB_API_URL environment variable.
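The mapping from operation to HTTP method and path can be sketched as a small routing function. The `Endpoint` and `Route` types and the `routeFor` name are hypothetical; only the methods and paths come from the list above. Authentication is left out because this README does not say how the API key is transmitted.

```typescript
type Endpoint =
  | { kind: "generate" }
  | { kind: "models" }
  | { kind: "modelDetail"; modelName: string }
  | { kind: "optimize" }
  | { kind: "optimizeAndGenerate" };

interface Route {
  method: "GET" | "POST";
  url: string;
}

// Combine the configured base URL with the documented path for each endpoint.
function routeFor(baseUrl: string, endpoint: Endpoint): Route {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  switch (endpoint.kind) {
    case "generate":
      return { method: "POST", url: `${base}/text-to-image` };
    case "models":
      return { method: "GET", url: `${base}/model-configs` };
    case "modelDetail":
      return {
        method: "GET",
        url: `${base}/model-configs/${encodeURIComponent(endpoint.modelName)}`,
      };
    case "optimize":
      return { method: "POST", url: `${base}/optimize_params_v2` };
    case "optimizeAndGenerate":
      return { method: "POST", url: `${base}/optimize_and_generate_image` };
  }
}
```

Encoding the model name keeps the path well-formed even if a name contains characters such as slashes or spaces.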
Examples
// Sample MCP client usage

// 1. Optimize and generate (recommended)
await mcp.callTool('optimize_and_generate_image', {
  query: 'Futuristic cityscape at sunset viewed from above',
  generation_overrides: {
    width: 768,
    height: 512
  }
});

// 2. Direct generation when you need manual tweaks
await mcp.callTool('generate_image', {
  prompt: 'Futuristic cityscape at sunset viewed from above',
  model: 'dreamshaper8',
  negative_prompt: 'blurry, low quality',
  width: 768,
  height: 512,
  steps: 20,
  guidance_scale: 7.5
});

// 3. List available models
await mcp.callTool('get_available_models', {});

// 4. Obtain only the optimization output
await mcp.callTool('optimize_prompt', {
  query: 'A cat relaxing in a garden'
});

// 5. Generate programmatic drawings
await mcp.callTool('draw_image', {
  commands: [
    { type: 'fill', color: '#ffffff' },
    { type: 'rect', x: 0, y: 0, width: 512, height: 512 },
    { type: 'fill', color: '#ff6b6b' },
    { type: 'circle', x: 256, y: 200, radius: 80 },
    { type: 'fill', gradient: { type: 'linear', x0: 100, y0: 300, x1: 400, y1: 450, stops: [
      { offset: 0, color: '#4dabf7' },
      { offset: 1, color: '#51cf66' }
    ]}},
    { type: 'rect', x: 156, y: 300, width: 200, height: 150 }
  ],
  width: 512,
  height: 512
});
Development
Run the server in development mode:
npm run dev
License
MIT License
