
@michelabboud/visual-forge-mcp v0.12.2

MCP server for AI image generation from markdown and HTML with multi-provider support, model testing, and automatic web optimization

Visual Forge MCP 🎨

AI-powered image generation for technical documentation with multi-provider support

Visual Forge MCP Hero Banner

An MCP (Model Context Protocol) server that automates AI image generation for technical documentation. Parse markdown files containing image prompts and generate professional images using multiple AI providers.



🌟 Features

  • 📝 Multi-Format Support: Extract image specifications from Markdown (.md) and HTML (.html) files
  • 🔌 Multi-Provider Support: 8 AI providers with automatic fallback
    • OpenAI GPT Image
    • Google Gemini 2.5 Flash Image (Nano Banana)
    • Stability AI SDXL
    • Replicate FLUX Schnell
    • Leonardo Phoenix
    • HuggingFace Inference
    • xAI Grok 2 Image
    • Z.ai GLM-Image ✨ NEW - Excellent for text-heavy diagrams
  • 🎯 Multi-Model Architecture ✨ NEW v0.8.0: Each provider can offer multiple models with different capabilities and pricing (e.g., OpenAI GPT Image 1 vs GPT Image 1 HD)
  • 🧪 Model Testing & Comparison ✨ NEW v0.9.0: Test and compare AI models before production use
    • Standard automated tests with quality scoring (sharpness, brightness, text rendering, color accuracy)
    • Custom prompt testing with real use cases
    • Side-by-side multi-provider comparison with intelligent recommendations
    • Permission flow for cost-aware testing
    • Persistent test results for historical tracking
  • 🎨 Detailed Global Context: Comprehensive styling system with hex colors, typography, layout rules, and audience targeting for dramatically better, more consistent images
  • 🖼️ Multi-Format Optimization ✨ NEW v0.7.0: Automatic generation of WebP (94% smaller), JPEG (85% smaller), and optional lossy PNG (70% smaller) with professional watermarking
  • 🔍 Quality Validation & Auto-Regeneration ✨ NEW v0.7.0: OCR-based text detection, sharpness/brightness analysis, and automatic retry on quality failure
  • 📁 Index-Based Directories ✨ NEW v0.7.0: Collision-free sequential directory naming with complete metadata and generation logs per image
  • ✏️ Enhanced Prompting ✨ NEW v0.7.0: Type-specific prompt optimization to minimize AI text rendering errors
  • 💰 Cost-Effective: Options from $0.00 to $0.12/image
  • ⚡ Flexible Workflows: Interactive, Batch, and Bulk generation modes
  • 🔄 State Persistence: Resume interrupted sessions without losing progress
  • 📊 Cost Tracking: Real-time cost monitoring and estimates per provider
  • 🎯 Smart Rate Limiting: Token bucket algorithm prevents API bans
  • 🔒 Automatic Backups: File backup system with approve/restore workflow (optional, enabled by default)
  • 📄 PDF Generation ✨ NEW v0.11.0: Generate PDFs from markdown using Typst engine with XeLaTeX fallback, custom templates, and automatic WebP to PNG conversion
  • 🌐 Universal MCP Support: Compatible with 24+ MCP clients (tested with Claude Code, Claude Desktop, Zed)

📦 Installation

Prerequisites

  • Node.js 24+ (recommended for optimal performance)
  • At least one AI provider API key (see Provider Setup)

Option 1: Install from NPM

npm install -g @michelabboud/visual-forge-mcp

Option 2: Install from Source (Current)

# Clone the repository
git clone https://github.com/michelabboud/visual-forge-mcp.git
cd visual-forge-mcp

# Install dependencies
npm install

# Build the project
npm run build

# (Optional) Link globally for CLI usage
npm link

⚙️ Configuration

1. Provider Setup

Create a .env file in the project root with at least one provider API key:

# === AI PROVIDER API KEYS ===
# Choose at least one provider below

# Google Gemini 2.5 Flash Image ($0.039/image)
GOOGLE_API_KEY=AIza...

# Replicate FLUX Schnell ($0.003/image) - Most cost-effective
REPLICATE_API_TOKEN=r8_...

# Stability AI SDXL ($0.04/image)
STABILITY_API_KEY=sk-...

# OpenAI GPT Image ($0.04-0.12/image)
OPENAI_API_KEY=sk-...

# xAI Grok 2 Image ($0.07/image)
XAI_API_KEY=xai-...

# Z.ai GLM-Image ($0.015/image) - NEW: Excellent for text-heavy diagrams
ZAI_API_KEY=zai-...

# Leonardo Phoenix ($0.02/image)
LEONARDO_API_KEY=...

# HuggingFace Inference (no-cost models available)
HUGGINGFACE_API_KEY=hf_...

# === OPTIONAL CONFIGURATION ===
IMAGE_GEN_OUTPUT_DIR=./generated-images
IMAGE_GEN_STATE_DIR=~/.visual-forge-mcp
IMAGE_GEN_LOG_LEVEL=info
IMAGE_GEN_DEFAULT_PROVIDER=gemini
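As an illustrative pre-flight check (not part of the package itself), you can map each provider to the environment variable listed above and report which ones are set before starting the server:

```typescript
// Illustrative only: provider names are the server's short names, and the
// env var names come from the .env template above.
const PROVIDER_KEYS: Record<string, string> = {
  gemini: "GOOGLE_API_KEY",
  replicate: "REPLICATE_API_TOKEN",
  stability: "STABILITY_API_KEY",
  openai: "OPENAI_API_KEY",
  xai: "XAI_API_KEY",
  zai: "ZAI_API_KEY",
  leonardo: "LEONARDO_API_KEY",
  huggingface: "HUGGINGFACE_API_KEY",
};

// Return the providers whose API key variable is present and non-empty.
function configuredProviders(env: Record<string, string | undefined>): string[] {
  return Object.entries(PROVIDER_KEYS)
    .filter(([, key]) => Boolean(env[key]))
    .map(([provider]) => provider);
}

// e.g. with only GOOGLE_API_KEY set:
configuredProviders({ GOOGLE_API_KEY: "AIza..." }); // -> ["gemini"]
```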

Get API Keys:

2. MCP Server Configuration

Add Visual Forge MCP to your MCP client configuration.

Two configuration options:

  • Option A (npm package): Use npx to run the published npm package - no local install needed
  • Option B (local source): Use absolute path to your cloned repository

🧪 Tested Clients

These configurations have been tested and verified:

Claude Code

Edit ~/.claude/settings.json:

Option A: Using npm package (recommended)

{
  "mcpServers": {
    "visual-forge": {
      "command": "npx",
      "args": ["-y", "@michelabboud/visual-forge-mcp"],
      "env": {
        "GOOGLE_API_KEY": "${GOOGLE_API_KEY}",
        "REPLICATE_API_TOKEN": "${REPLICATE_API_TOKEN}",
        "STABILITY_API_KEY": "${STABILITY_API_KEY}",
        "OPENAI_API_KEY": "${OPENAI_API_KEY}",
        "XAI_API_KEY": "${XAI_API_KEY}"
      }
    }
  }
}

Option B: Using local source

{
  "mcpServers": {
    "visual-forge": {
      "command": "node",
      "args": ["/path/to/visual-forge-mcp/dist/index.js"],
      "env": {
        "GOOGLE_API_KEY": "${GOOGLE_API_KEY}",
        "REPLICATE_API_TOKEN": "${REPLICATE_API_TOKEN}",
        "STABILITY_API_KEY": "${STABILITY_API_KEY}",
        "OPENAI_API_KEY": "${OPENAI_API_KEY}",
        "XAI_API_KEY": "${XAI_API_KEY}"
      }
    }
  }
}
Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

Option A: Using npm package (recommended)

{
  "mcpServers": {
    "visual-forge": {
      "command": "npx",
      "args": ["-y", "@michelabboud/visual-forge-mcp"],
      "env": {
        "GOOGLE_API_KEY": "your-key-here",
        "REPLICATE_API_TOKEN": "your-token-here"
      }
    }
  }
}

Option B: Using local source

{
  "mcpServers": {
    "visual-forge": {
      "command": "node",
      "args": ["/path/to/visual-forge-mcp/dist/index.js"],
      "env": {
        "GOOGLE_API_KEY": "your-key-here",
        "REPLICATE_API_TOKEN": "your-token-here"
      }
    }
  }
}
Zed

Edit ~/.config/zed/settings.json:

Option A: Using npm package (recommended)

{
  "context_servers": {
    "visual-forge": {
      "command": {
        "path": "npx",
        "args": ["-y", "@michelabboud/visual-forge-mcp"]
      }
    }
  }
}

Option B: Using local source

{
  "context_servers": {
    "visual-forge": {
      "command": {
        "path": "node",
        "args": ["/path/to/visual-forge-mcp/dist/index.js"]
      }
    }
  }
}

🚀 MCP-Compatible Clients

These clients support MCP protocol and should work with Visual Forge. Configuration formats may vary slightly:

Cursor

Settings → Features → MCP:

Option A: Using npm package (recommended)

{
  "visual-forge": {
    "command": "npx",
    "args": ["-y", "@michelabboud/visual-forge-mcp"]
  }
}

Option B: Using local source

{
  "visual-forge": {
    "command": "node",
    "args": ["/path/to/visual-forge-mcp/dist/index.js"]
  }
}
Windsurf

Edit ~/.windsurf/settings.json:

Option A: Using npm package (recommended)

{
  "mcpServers": {
    "visual-forge": {
      "command": "npx",
      "args": ["-y", "@michelabboud/visual-forge-mcp"]
    }
  }
}

Option B: Using local source

{
  "mcpServers": {
    "visual-forge": {
      "command": "node",
      "args": ["/path/to/visual-forge-mcp/dist/index.js"]
    }
  }
}
Cline (VS Code Extension)

Edit ~/.continue/config.json or use Cline's settings UI:

Option A: Using npm package (recommended)

{
  "experimental": {
    "mcpServers": {
      "visual-forge": {
        "command": "npx",
        "args": ["-y", "@michelabboud/visual-forge-mcp"]
      }
    }
  }
}

Option B: Using local source

{
  "experimental": {
    "mcpServers": {
      "visual-forge": {
        "command": "node",
        "args": ["/path/to/visual-forge-mcp/dist/index.js"]
      }
    }
  }
}
Continue (VS Code Extension)

Edit ~/.continue/config.json:

Option A: Using npm package (recommended)

{
  "experimental": {
    "mcpServers": {
      "visual-forge": {
        "command": "npx",
        "args": ["-y", "@michelabboud/visual-forge-mcp"]
      }
    }
  }
}

Option B: Using local source

{
  "experimental": {
    "mcpServers": {
      "visual-forge": {
        "command": "node",
        "args": ["/path/to/visual-forge-mcp/dist/index.js"]
      }
    }
  }
}
Roo Code (VS Code Extension)

Same as Continue (Roo Code is a fork):

Option A: Using npm package (recommended)

{
  "experimental": {
    "mcpServers": {
      "visual-forge": {
        "command": "npx",
        "args": ["-y", "@michelabboud/visual-forge-mcp"]
      }
    }
  }
}

Option B: Using local source

{
  "experimental": {
    "mcpServers": {
      "visual-forge": {
        "command": "node",
        "args": ["/path/to/visual-forge-mcp/dist/index.js"]
      }
    }
  }
}
Copilot (GitHub Copilot with MCP support)

Configuration via VS Code settings or copilot config file. Consult GitHub Copilot documentation for MCP integration.

Qodo Gen (IDE Extension)

Follow Qodo Gen's MCP integration docs. Typically similar to VS Code extension format.

JetBrains IDEs (IntelliJ, PyCharm, WebStorm, etc.)

Install MCP plugin, then add to IDE settings or .idea/mcp-servers.json:

Option A: Using npm package (recommended)

{
  "mcpServers": {
    "visual-forge": {
      "command": "npx",
      "args": ["-y", "@michelabboud/visual-forge-mcp"]
    }
  }
}

Option B: Using local source

{
  "mcpServers": {
    "visual-forge": {
      "command": "node",
      "args": ["/path/to/visual-forge-mcp/dist/index.js"]
    }
  }
}

🖥️ Terminal & CLI Clients

Aider

Option A: Using npm package (recommended)

aider --mcp-server "npx -y @michelabboud/visual-forge-mcp"

Or add to ~/.aider.conf.yml:

mcp-servers:
  - name: visual-forge
    command: npx
    args:
      - -y
      - @michelabboud/visual-forge-mcp

Option B: Using local source

aider --mcp-server "node /path/to/visual-forge-mcp/dist/index.js"

Or add to ~/.aider.conf.yml:

mcp-servers:
  - name: visual-forge
    command: node
    args:
      - /path/to/visual-forge-mcp/dist/index.js
Goose

Create ~/.config/goose/config.yaml:

Option A: Using npm package (recommended)

mcp_servers:
  visual-forge:
    command: npx
    args:
      - -y
      - @michelabboud/visual-forge-mcp

Option B: Using local source

mcp_servers:
  visual-forge:
    command: node
    args:
      - /path/to/visual-forge-mcp/dist/index.js
Warp Terminal

Settings → MCP Servers → Add Server:

Option A: Using npm package (recommended)

{
  "name": "visual-forge",
  "command": "npx",
  "args": ["-y", "@michelabboud/visual-forge-mcp"]
}

Option B: Using local source

{
  "name": "visual-forge",
  "command": "node",
  "args": ["/path/to/visual-forge-mcp/dist/index.js"]
}
Gemini CLI (Google)

Configuration via CLI flags or config file. Consult Google's Gemini CLI documentation.

opencode

Add to opencode config (typically ~/.opencode/config.json):

Option A: Using npm package (recommended)

{
  "mcpServers": {
    "visual-forge": {
      "command": "npx",
      "args": ["-y", "@michelabboud/visual-forge-mcp"]
    }
  }
}

Option B: Using local source

{
  "mcpServers": {
    "visual-forge": {
      "command": "node",
      "args": ["/path/to/visual-forge-mcp/dist/index.js"]
    }
  }
}
Codex (OpenAI)

Follow OpenAI Codex documentation for MCP server integration.


🎨 Specialized Platforms

Amp (Sourcegraph)

Add to workspace settings or Amp configuration file.

LM Studio

Tools → MCP Servers → Add Server:

Option A: Using npm package (recommended)

{
  "visual-forge": {
    "command": "npx",
    "args": ["-y", "@michelabboud/visual-forge-mcp"]
  }
}

Option B: Using local source

{
  "visual-forge": {
    "command": "node",
    "args": ["/path/to/visual-forge-mcp/dist/index.js"]
  }
}
OpenHands (All Hands AI)

Edit OpenHands config file:

Option A: Using npm package (recommended)

mcp_servers:
  - name: visual-forge
    command: npx
    args:
      - -y
      - @michelabboud/visual-forge-mcp

Option B: Using local source

mcp_servers:
  - name: visual-forge
    command: node
    args:
      - /path/to/visual-forge-mcp/dist/index.js
Factory

Add via Factory's MCP integration settings.

Kiro (AWS)

Add via Kiro IDE settings → MCP Servers.


📝 Text Editors

Neovim

Install MCP plugin (e.g., mcp.nvim), then add to init.lua:

Option A: Using npm package (recommended)

require('mcp').setup({
  servers = {
    ['visual-forge'] = {
      command = 'npx',
      args = { '-y', '@michelabboud/visual-forge-mcp' }
    }
  }
})

Option B: Using local source

require('mcp').setup({
  servers = {
    ['visual-forge'] = {
      command = 'node',
      args = { '/path/to/visual-forge-mcp/dist/index.js' }
    }
  }
})

📖 General MCP Setup Pattern:

Most MCP clients follow this pattern. Choose either npm package or local source:

Option A: Using npm package (recommended)

{
  "mcpServers": {
    "visual-forge": {
      "command": "npx",
      "args": ["-y", "@michelabboud/visual-forge-mcp"],
      "env": {
        "GOOGLE_API_KEY": "your-key-here"
      }
    }
  }
}

Option B: Using local source

{
  "mcpServers": {
    "visual-forge": {
      "command": "node",
      "args": ["/absolute/path/to/visual-forge-mcp/dist/index.js"],
      "env": {
        "GOOGLE_API_KEY": "your-key-here"
      }
    }
  }
}

If your client isn't listed, consult its MCP integration documentation.


📖 Detailed Setup Guides

For popular AI coding assistants, we provide comprehensive setup guides:

🎯 Cursor

Complete Cursor Setup Guide →

Features covered:

  • Step-by-step configuration
  • MCP settings (UI and manual)
  • Usage examples with Cursor's context system
  • Composer mode integration
  • Troubleshooting common issues

🔄 Continue (VS Code)

Complete Continue Setup Guide →

Features covered:

  • VS Code extension setup
  • config.json configuration
  • Slash commands with Visual Forge
  • Multi-file operations
  • Custom prompts and workflows

Other clients: Follow the general MCP configuration pattern shown in the Configuration section above.


🔗 n8n Workflow Automation Integration

Visual Forge MCP can be integrated with n8n for automated image generation workflows.

Quick Start with n8n

  1. Start HTTP API Server:
cd visual-forge-mcp
npm run dev:http  # Runs on http://localhost:3000
  2. Choose Integration Method:
    • HTTP API: Use n8n HTTP Request nodes (works immediately)
    • Custom Node: Install n8n-nodes-visual-forge community node (drag-and-drop)
    • Example Workflows: Import pre-built workflow JSON files

Integration Options

Option 1: HTTP API (Immediate)

Use n8n's HTTP Request nodes to call Visual Forge endpoints:

Webhook → HTTP: Start Session → HTTP: Process Files → HTTP: Generate Images → HTTP: Finalize Session

API Endpoints:

  • POST /api/vf/start-session - Create workflow session
  • POST /api/vf/process-files - Add placeholders and parse
  • POST /api/vf/generate-session - Generate all images
  • POST /api/vf/finalize-session - Commit and create PR
  • GET /api/vf/sessions - List all sessions
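The same endpoints can be driven from code instead of n8n. A minimal sketch, assuming the HTTP server started with `npm run dev:http` on port 3000; only the endpoint paths come from the list above, and the request-body field is a hypothetical example:

```typescript
// Sketch: compose a POST request to the local Visual Forge HTTP API.
// The `name` field in the body is illustrative, not a documented parameter.
const BASE_URL = "http://localhost:3000";

function post(endpoint: string, body: unknown) {
  return {
    url: `${BASE_URL}${endpoint}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

const req = post("/api/vf/start-session", { name: "nightly-docs" });
// const res = await fetch(req.url, req.init);
```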

Option 2: Custom n8n Node (Drag-and-Drop)

Install the Visual Forge community node:

  1. In n8n: Settings > Community Nodes > Install
  2. Enter: n8n-nodes-visual-forge
  3. Restart n8n

Then drag Visual Forge nodes into your workflows with built-in UI for all operations.

Option 3: Example Workflows (Templates)

Import pre-built workflows from n8n/n8n-workflows/:

  • 01-simple-image-generation.json - Basic webhook-triggered generation
  • 02-batch-processing-with-cdn.json - Scheduled batch with S3/CDN upload
  • 03-webhook-triggered-generation.json - Advanced conditional generation

Documentation

Use Cases

  • Automated Documentation: Nightly regeneration of all doc images
  • CI/CD Integration: Generate images on markdown file changes
  • Webhook Triggers: External systems trigger image generation
  • CDN Deployment: Generate and upload to S3/CloudFront
  • Slack Notifications: Alert team when images are generated

✅ Compatibility & Testing Status

Tested & Verified

  • Claude Code - Full functionality tested
  • Claude Desktop - Full functionality tested
  • Zed - Full functionality tested

Should Work (MCP Standard Compliant)

All clients implementing the MCP protocol should work with Visual Forge. The server uses the official @modelcontextprotocol/sdk and follows the MCP specification.

Reported Working: Cursor, Windsurf, Continue, Cline, Aider, Goose

Not Yet Tested: Amp, Copilot, Factory, Gemini CLI, Kiro, LM Studio, opencode, OpenHands, Qodo Gen, Warp, JetBrains IDEs, Neovim, Roo Code, Codex

If you test Visual Forge with any client, please report your results via GitHub Issues!


🚀 Quick Start

Workflow Diagram - Generation Modes

Option 1: Configure via MCP Tools (Recommended)

Once the MCP server is added to your client, you can configure providers directly through MCP tools:

User: "Configure Gemini provider with my API key"
Claude: [Uses configure_provider tool]

User: "Check which providers are configured"
Claude: [Uses get_provider_status tool]

User: "Test if my Gemini connection works"
Claude: [Uses test_provider_connection tool]

Available Configuration Tools:

  • configure_provider - Set API key for a provider
  • get_provider_status - Check which providers have API keys set
  • test_provider_connection - Verify API key works
  • remove_provider - Remove a provider's API key

Option 2: Manual .env Configuration

Create a .env file in the project root:

GOOGLE_API_KEY=AIza...
REPLICATE_API_TOKEN=r8_...

Standalone CLI Usage

# Check available providers
npx tsx scripts/check-providers.ts

# Generate with all providers (comparison)
npx tsx scripts/generate-all-providers.ts

# Generate with specific provider
npx tsx scripts/generate-with-xai.ts

# Generate with custom theme and versioning
npx tsx scripts/generate-solo-theme-test.ts

MCP Tool Usage (via MCP Client)

Once configured, interact with Visual Forge through your MCP client:

User: "Generate a professional image of a DevOps engineer working on cloud infrastructure"

Claude: [Uses visual-forge MCP tools to generate the image]

🛠️ MCP Tools Reference

MCP Tools Architecture

Visual Forge provides the following MCP tools for use through your MCP client:

Configuration Tools

configure_provider

Set API key for an image generation provider.

{
  "provider": "gemini",
  "apiKey": "AIza..."
}

Example: "Configure my Replicate API key: r8_abc123..."

get_provider_status

Check which providers have API keys configured.

{}

Example: "Show me which providers are configured"

Returns:

{
  "success": true,
  "providers": [
    {
      "provider": "gemini",
      "displayName": "Google Gemini",
      "configured": true,
      "keyPreview": "AIza****abc123"
    },
    {
      "provider": "openai",
      "displayName": "OpenAI",
      "configured": false
    }
  ]
}

test_provider_connection

Test if a provider API key is valid by checking availability.

{
  "provider": "gemini"
}

Example: "Test my Gemini connection"

remove_provider

Remove API key for a provider.

{
  "provider": "gemini"
}

Example: "Remove my OpenAI API key"

Model Selection & Testing Tools ✨ NEW v0.9.0

set_default_model

Set the default model for a provider. This model will be used for all future generations unless overridden.

{
  "provider": "zai",
  "modelId": "glm-image"
}

Example: "Set Z.ai to use the GLM-Image model"

Returns:

{
  "success": true,
  "provider": "zai",
  "modelId": "glm-image",
  "modelName": "GLM-Image",
  "message": "Default model set to 'GLM-Image' for Z.ai GLM-Image..."
}

get_model_info

Get detailed information about a specific model, including test results if available.

{
  "provider": "gemini",
  "modelId": "gemini-2.5-flash-image"
}

Example: "What are the details for Gemini Flash Image?"

Returns:

{
  "success": true,
  "provider": "gemini",
  "providerName": "Google Gemini 2.5 Flash Image",
  "model": {
    "id": "gemini-2.5-flash-image",
    "name": "Gemini 2.5 Flash Image",
    "costPerImage": 0.0,
    "description": "Fast, free-tier image generation",
    "capabilities": {
      "maxResolution": "2048x2048",
      "supportedAspectRatios": ["1:1", "16:9", "4:3", "9:16"]
    }
  },
  "testResult": {
    "testedAt": "2026-01-16T10:30:00.000Z",
    "qualityScore": 85.5,
    "passed": true
  },
  "message": "Model tested on 1/16/2026 with quality score 85.5/100"
}

test_model

Test a model with either standard test (automated) or custom prompt (user-provided). Records quality score for future reference.

Standard Test (Automated Validation):

{
  "provider": "zai",
  "modelId": "glm-image",
  "useStandardTest": true,
  "aspectRatio": "16:9"
}

Custom Prompt Test (Real Use Case):

{
  "provider": "gemini",
  "modelId": "gemini-2.5-flash-image",
  "prompt": "AWS VPC architecture diagram showing public/private subnets, NAT gateway, and EC2 instances",
  "aspectRatio": "16:9"
}

Example: "Test the Z.ai GLM-Image model with a standard quality test"

Returns:

{
  "success": true,
  "provider": "zai",
  "providerName": "Z.ai GLM-Image",
  "model": "GLM-Image",
  "testImage": {
    "filepath": "generated-images/tests/zai-glm-image-test.png",
    "generationTime": 12000,
    "actualCost": 0.015
  },
  "qualityScore": {
    "overall": 87.5,
    "sharpness": 89.2,
    "brightness": 145,
    "textRendering": 85.0,
    "colorAccuracy": 90.0,
    "passed": true
  },
  "message": "Model test passed! Quality score: 87.5/100. Model is ready for production use."
}

Quality Metrics:

  • Sharpness (30%): Laplacian variance analysis for edge detection
  • Brightness (20%): Average brightness in optimal range (30-240)
  • Text Rendering (40%): Estimated OCR accuracy and text clarity
  • Color Accuracy (10%): Heuristic-based color validation
  • Overall Score: Weighted average, pass threshold 60/100
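The weighting above can be sketched as a small scoring function. The weights and the 60/100 pass threshold come from the list; the brightness-to-score mapping is an assumption, since the README only states the optimal raw range of 30-240:

```typescript
interface QualityInputs {
  sharpness: number;      // 0-100
  brightness: number;     // raw average brightness, 0-255
  textRendering: number;  // 0-100
  colorAccuracy: number;  // 0-100
}

// Assumed mapping: full marks when raw brightness is inside the 30-240 range.
function brightnessScore(raw: number): number {
  return raw >= 30 && raw <= 240 ? 100 : 0;
}

function overallScore(q: QualityInputs): { overall: number; passed: boolean } {
  const raw =
    0.3 * q.sharpness +
    0.2 * brightnessScore(q.brightness) +
    0.4 * q.textRendering +
    0.1 * q.colorAccuracy;
  const overall = Math.round(raw * 10) / 10;
  return { overall, passed: overall >= 60 };
}
```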

compare_models

Compare multiple providers/models side-by-side with the same prompt. Generates quality scores and recommendation.

{
  "prompt": "Technical diagram showing microservices architecture with API gateway, service mesh, and databases",
  "providers": [
    { "provider": "gemini", "model": "gemini-2.5-flash-image" },
    { "provider": "zai", "model": "glm-image" },
    { "provider": "huggingface", "model": "black-forest-labs/FLUX.1-dev" }
  ],
  "aspectRatio": "16:9"
}

Example: "Compare Gemini, Z.ai, and HuggingFace FLUX models for generating a microservices architecture diagram"

Returns:

{
  "success": true,
  "prompt": "Technical diagram showing microservices architecture...",
  "totalCost": 0.015,
  "totalTime": 35000,
  "results": [
    {
      "provider": "zai",
      "model": "GLM-Image",
      "qualityScore": { "overall": 92.1, "textRendering": 95.8 },
      "cost": 0.015,
      "rank": 1
    },
    {
      "provider": "gemini",
      "model": "Gemini Flash Image",
      "qualityScore": { "overall": 85.5 },
      "cost": 0.0,
      "rank": 2
    },
    {
      "provider": "huggingface",
      "model": "FLUX.1-dev",
      "qualityScore": { "overall": 78.3 },
      "cost": 0.0,
      "rank": 3
    }
  ],
  "recommendation": {
    "provider": "zai",
    "model": "glm-image",
    "reason": "Highest overall quality (92.1/100), especially excellent text rendering (95.8/100). Worth the $0.015 cost for technical diagrams.",
    "alternatives": [
      {
        "provider": "gemini",
        "model": "gemini-2.5-flash-image",
        "reason": "Free alternative with good quality (85.5/100)"
      }
    ]
  },
  "message": "Comparison complete! zai glm-image scored highest (rank 1)..."
}

Image Generation Tools

parse_markdown

Parse markdown files to extract image specifications.

{
  "filePaths": ["./docs/guide.md"]
}

list_providers

List all available image generation providers and their capabilities.

{}

generate_image

Generate a single image from specification.

{
  "imageId": "chapter-01-img-01",
  "provider": "gemini"
}

list_images

List all parsed image specifications.

{
  "filter": "diagram"
}

Workflow Tools

start_workflow

Start image generation workflow with specified mode.

{
  "mode": "interactive",
  "provider": "gemini",
  "imageIds": ["img-01", "img-02"]
}

Modes: interactive, batch, bulk

get_status

Get current workflow status and progress.

{}

pause_workflow / resume_workflow

Control workflow execution.

{}

Cost Tracking Tools

get_cost_summary

Get cost summary for all generated images by provider.

{}

📖 Usage Examples

Example 1: Generate Single Image with Specific Provider

// generate-example.ts
import { providerFactory } from './src/providers/index.js';
import { ImageSpec } from './src/types/index.js';

providerFactory.initialize();
const provider = providerFactory.getProvider('gemini');

const spec: ImageSpec = {
  id: 'my-image-01',
  prompt: 'A modern cloud architecture diagram showing microservices...',
  aspectRatio: '16:9',
  // ... other fields
};

const result = await provider.generate(spec);
console.log(`Generated: ${result.filepath}`);

Example 2: Generate with Multiple Versions

# Generate 2 versions of each image for comparison
npx tsx generate-solo-theme-test.ts

Example 3: Cost Estimation

const gemini = providerFactory.getProvider('gemini');
const estimatedCost = gemini.estimateCost(spec);
console.log(`Estimated cost: $${estimatedCost}`);

🎨 Supported Providers

AI Provider Comparison

| Provider | Model | Cost/Image | Speed | Quality | Status |
|----------|-------|-----------|-------|---------|--------|
| Google Gemini | 2.5 Flash Image | $0.00 | ⚡⚡⚡ Fast | Excellent | ✅ Tested |
| Replicate | FLUX Schnell | $0.003 | ⚡⚡⚡ Fast | Excellent | ⭐ Recommended |
| Leonardo | Phoenix | $0.02 | ⚡⚡ Medium | Excellent | ✅ Ready |
| Stability AI | SDXL 1.0 | $0.04 | ⚡⚡ Medium | Excellent | ✅ Tested |
| OpenAI | GPT Image 1 | $0.04-0.12 | ⚡ Medium | Best | ✅ Fixed |
| xAI | Grok 2 Image | $0.07 | ⚡⚡ Fast | Good | ⚠️ Preview |
| HuggingFace | Various | $0.00 | ⚡ Slow | Varies | 🔧 Beta |

Cost Comparison (151 images)

| Provider | Total Cost | Savings vs OpenAI HD |
|----------|-----------|---------------------|
| Replicate FLUX | $0.45 | 96% cheaper ⭐ |
| Leonardo | $3.02 | 75% cheaper |
| Gemini | $5.89 | 51% cheaper |
| Stability AI | $6.04 | 50% cheaper |
| OpenAI Standard | $6.04 | 50% cheaper |
| xAI Grok | $10.57 | 12% cheaper |
| OpenAI HD | $12.08 | Baseline |
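The totals are per-image cost multiplied by 151 images, with savings measured against OpenAI HD. A worked sketch (the $0.08/image HD rate is back-derived from the $12.08 total; the other rates match the Configuration section):

```typescript
const IMAGE_COUNT = 151;
const OPENAI_HD_RATE = 0.08; // assumed: $12.08 / 151 images

// Total cost for the batch, rounded to cents.
function totalCost(perImage: number): number {
  return Math.round(perImage * IMAGE_COUNT * 100) / 100;
}

// Percentage saved relative to OpenAI HD, rounded to whole percent.
function savingsVsHd(perImage: number): number {
  return Math.round((1 - perImage / OPENAI_HD_RATE) * 100);
}

totalCost(0.003);   // -> 0.45 (Replicate FLUX)
savingsVsHd(0.003); // -> 96 (% cheaper)
```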


🏗️ Project Structure

Visual Forge MCP Architecture

visual-forge-mcp/
├── src/                    # Source code
│   ├── providers/          # AI provider implementations
│   │   ├── openai/         # OpenAI GPT Image
│   │   ├── gemini/         # Google Gemini 2.5 Flash
│   │   ├── stability/      # Stability AI SDXL
│   │   ├── replicate/      # Replicate FLUX
│   │   ├── leonardo/       # Leonardo Phoenix
│   │   ├── huggingface/    # HuggingFace Inference
│   │   ├── xai/            # xAI Grok 2 Image
│   │   └── base-provider.ts
│   ├── quality/            # Quality inspection module
│   ├── state/              # State management
│   ├── types/              # TypeScript definitions
│   └── utils/              # Utilities (logger, rate limiter)
│
├── scripts/                # Example & test scripts
│   ├── check-providers.ts  # Provider status checker
│   ├── generate-with-*.ts  # Provider-specific generators
│   ├── generate-all-providers.ts  # Multi-provider comparison
│   └── generate-solo-theme-test.ts  # Versioning test
│
├── docs/                   # Documentation
│   ├── VISUAL_MCP_SERVER.md  # Server specification
│   ├── IMPLEMENTATION-PLAN.md  # Architecture plan
│   ├── SESSION_SUMMARY.md  # Development notes
│   └── example.md          # Example prompts
│
├── test/                   # Test files
│   └── test-run.ts
│
├── dist/                   # Build output (gitignored)
├── generated-images/       # Generated images (gitignored)
│   ├── gemini/             # Gemini generated images
│   ├── openai/             # OpenAI generated images
│   ├── stability/          # Stability AI generated images
│   └── ...                 # Other providers
│
├── package.json            # NPM package configuration
├── tsconfig.json           # TypeScript configuration
├── README.md               # This file
├── CHANGELOG.md            # Version history
└── .env.example            # Environment variables template

🔧 Development

Build

npm run build

Development Mode

npm run dev

Linting

npm run lint

Testing

npm test
npm run test:watch

📊 Features in Detail

Quality Inspection

Automatically analyzes generated images for:

  • Sharpness: Laplacian variance algorithm
  • Brightness: 0-255 scale with warnings for extremes
  • File size: Validation (10KB - 10MB)
  • Dimensions: Aspect ratio verification
  • Format: Image encoding validation
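The Laplacian-variance sharpness measure mentioned above can be sketched on a grayscale image given as a 2-D array of 0-255 values; this is an illustration of the technique, not the server's actual implementation:

```typescript
// Apply a 4-neighbour Laplacian kernel to each interior pixel and return
// the variance of the responses: blurry images have low edge response
// everywhere, so the variance is low; sharp images score high.
function laplacianVariance(img: number[][]): number {
  const h = img.length;
  const w = img[0].length;
  const vals: number[] = [];
  for (let y = 1; y < h - 1; y++) {
    for (let x = 1; x < w - 1; x++) {
      vals.push(
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x]
      );
    }
  }
  const mean = vals.reduce((a, b) => a + b, 0) / vals.length;
  return vals.reduce((a, b) => a + (b - mean) ** 2, 0) / vals.length;
}
```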

Rate Limiting

Token bucket algorithm per provider:

  • OpenAI: 5 images/minute
  • Gemini: 20 images/minute
  • Stability AI: 10 images/minute
  • Replicate: 20 images/minute
  • xAI: 10 images/minute
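A token bucket of the kind described above can be sketched as follows; this illustrates the algorithm with the per-provider rates, not the server's actual limiter:

```typescript
// Each bucket holds up to `capacity` tokens and refills continuously at
// `refillPerMs` tokens per millisecond; a request consumes one token and
// is rejected when the bucket is empty.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(private capacity: number, private refillPerMs: number, now = Date.now()) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  tryTake(now = Date.now()): boolean {
    const elapsed = now - this.lastRefill;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Gemini's 20 images/minute => 20 / 60000 tokens per millisecond.
const gemini = new TokenBucket(20, 20 / 60_000);
```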

Cost Tracking

Real-time tracking:

  • Per-image cost
  • Provider totals
  • Session costs
  • Estimates before generation

🐛 Troubleshooting

Provider Not Available

❌ Gemini provider not available! Check GOOGLE_API_KEY in .env

Solution: Verify API key is set in .env file and restart the server.

Out of Credits (Stability AI)

❌ insufficient_balance: Your organization does not have enough balance

Solution: Add credits to your Stability AI account at https://platform.stability.ai

Rate Limit Errors

❌ Rate limit exceeded for provider openai

Solution: The rate limiter will automatically retry after cooldown. Reduce concurrency in workflow config.


🛠️ Development & Publishing

NPM Package Management

Visual Forge MCP includes a production-ready script for managing the npm package lifecycle with comprehensive safety checks and verbose logging.

Script Location: scripts/npm-manager.sh

Features

  • Publish with safety checks - 10 pre-publish validations
  • Version management - Bump patch/minor/major
  • Authentication support - NPM tokens and recovery codes
  • Git integration - Automatic tag creation
  • Dry-run mode - Test without making changes
  • Package info & stats - View registry information
  • Security audits - Check for vulnerabilities
  • Access management - Control package access
  • Comprehensive logging - All operations logged

Quick Start

# Show help
./scripts/npm-manager.sh --help

# Show package info
./scripts/npm-manager.sh info

# Bump version
./scripts/npm-manager.sh bump patch

# Publish (with all safety checks)
./scripts/npm-manager.sh publish --dry-run
./scripts/npm-manager.sh publish

Complete Release Workflow

# 1. Bump version
./scripts/npm-manager.sh bump patch

# 2. Update CHANGELOG.md
vim CHANGELOG.md

# 3. Commit changes
git commit -am "chore: Release v$(node -pe "require('./package.json').version")"

# 4. Test publish
./scripts/npm-manager.sh publish --dry-run

# 5. Publish to npm
./scripts/npm-manager.sh publish

# 6. Push to GitHub
git push && git push --tags

Authentication

With NPM Token (recommended for CI/CD):

export NPM_TOKEN=npm_your_token_here
./scripts/npm-manager.sh publish

With Recovery Codes:

npm login --auth-type=legacy
./scripts/npm-manager.sh publish

Documentation


🤝 Contributing

Contributions welcome! Please read CONTRIBUTING.md first.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

Personal Use Only - This software is free for personal, educational, and research purposes only.

Commercial use is NOT permitted without a commercial license.

See LICENSE file for complete terms and conditions.

For commercial licensing inquiries, please contact the author.


🙏 Acknowledgments

  • Built with MCP SDK
  • Powered by multiple AI image generation providers
  • Special thanks to the open-source community

📚 Resources

Documentation

Setup Guides

n8n Integration


🔗 Links

  • Repository: https://github.com/michelabboud/visual-forge-mcp
  • Issues: https://github.com/michelabboud/visual-forge-mcp/issues
  • MCP Documentation: https://modelcontextprotocol.io

⚡ Generate professional images for your documentation in seconds!

Made with ❤️ for the developer community