vision-generator-mcp

v0.1.0

Local MCP server for image and video generation via OpenAI-compatible providers

🎨 Vision Generator MCP

Local-first MCP server for image & video generation through OpenAI-compatible providers

Discover models automatically · generate media locally · save outputs explicitly · avoid context bloat


🚀 Why this exists

Most image/video APIs force clients to understand:

  • different endpoints
  • inconsistent payloads
  • mixed sync/async behavior
  • provider-specific output handling
  • confusing model lists with text-only models mixed in

Vision Generator MCP gives you one local MCP layer that:

  • auto-discovers models from the provider
  • keeps only image/video-capable models and filters out the rest
  • normalizes generation flows
  • writes outputs to a folder you choose
  • keeps chat context clean by not returning huge base64 images
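The discovery-and-filter idea in the list above can be sketched roughly as follows. This is an illustration only: the `ModelEntry` shape, the id-based hints, and the `filterVisionModels` helper are hypothetical and not taken from the package, whose own filtering rules may be quite different.

```typescript
// Hypothetical shape of one entry returned by an OpenAI-style GET /models call.
interface ModelEntry {
  id: string;
}

// Assumed capability flags; a real registry would likely track more operations.
interface InferredOperations {
  image: boolean;
  video: boolean;
}

// Illustrative heuristic only: guess capabilities from the model id.
const IMAGE_HINT = /image/i;
const VIDEO_HINT = /video/i;

function inferOperations(model: ModelEntry): InferredOperations {
  return {
    image: IMAGE_HINT.test(model.id),
    video: VIDEO_HINT.test(model.id),
  };
}

// Keep only models that appear to support at least one image or video operation.
function filterVisionModels(models: ModelEntry[]): ModelEntry[] {
  return models.filter((model) => {
    const ops = inferOperations(model);
    return ops.image || ops.video;
  });
}

const discovered: ModelEntry[] = [
  { id: "gpt-image-2" },
  { id: "some-text-model" },
  { id: "your-video-model" },
];

console.log(filterVisionModels(discovered).map((model) => model.id));
// prints [ 'gpt-image-2', 'your-video-model' ]
```

Guessing from the id is fragile; a provider that reports capabilities in its `/models` payload would be a more reliable source if one is available.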

✨ Highlights

| Feature | What you get |
|---|---|
| Auto model discovery | Uses GET /models at runtime |
| Vision-only filtering | Hides irrelevant text-only models via buildVisionModelRegistry() |
| Required output folder | Every generated asset has a clear final_path |
| Async video flow | Submit + poll video jobs cleanly |
| Local-first workflow | Great for desktop / VS Code / Claude workflows |
| Configurable timeouts | Provider and download timeouts are configurable from MCP settings |
| Modular architecture | Clean separation across src/providers/, src/core/, src/tools/, src/validation/ |


🧭 How it works

┌───────────────────────────────────────────────┐
│ MCP Client / Agent                            │
│ Claude / VS Code / Desktop / Local workflow   │
└───────────────────────┬───────────────────────┘
                        │
                        │ MCP tools
                        ▼
┌───────────────────────────────────────────────┐
│ Vision Generator MCP                          │
│-----------------------------------------------│
│ Tool handlers                                 │
│ Validation                                    │
│ Vision service                                │
│ Model discovery                               │
│ Capability filtering                          │
│ Output publishing                             │
└───────────────────────┬───────────────────────┘
                        │
                        │ Adapter abstraction
                        ▼
┌───────────────────────────────────────────────┐
│ OpenAI-compatible provider adapter            │
└───────────────────────┬───────────────────────┘
                        │
                        │ HTTP
                        ▼
┌───────────────────────────────────────────────┐
│ Provider API                                  │
│ /models                                       │
│ /images/generations                           │
│ /images/edits                                 │
│ /videos/generations                           │
└───────────────────────────────────────────────┘
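
To make the bottom two boxes of the diagram more concrete, here is a minimal sketch of how an adapter might turn an operation into a provider URL. The `Operation` names follow the flags used in this README's tool reference, but the `ROUTES` table and the `endpointFor` helper are assumptions for illustration; in particular, sending `image_to_video` to `/videos/generations` is a guess that the source does not confirm.

```typescript
// Operation names as they appear in the list_models example output.
type Operation =
  | "image_generation"
  | "image_editing"
  | "text_to_video"
  | "image_to_video";

// Assumed mapping from operation to the provider routes shown in the diagram.
// The image_to_video entry is a guess, not something the README states.
const ROUTES: Record<Operation, string> = {
  image_generation: "/images/generations",
  image_editing: "/images/edits",
  text_to_video: "/videos/generations",
  image_to_video: "/videos/generations",
};

// Join the configured base URL and a route without producing a double slash.
function endpointFor(baseUrl: string, operation: Operation): string {
  return baseUrl.replace(/\/+$/, "") + ROUTES[operation];
}

console.log(endpointFor("https://provider.example/v1/", "image_editing"));
// prints "https://provider.example/v1/images/edits"
```

Keeping the routing in one table means a provider with different paths only needs a different table, which is the kind of variation an adapter layer is meant to absorb.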

🧱 Project structure

.
├─ README.md
├─ package.json
├─ tsconfig.json
├─ plans/
│  └─ mcp-image-video-architecture-plan.md
├─ src/
│  ├─ index.ts
│  ├─ config/
│  │  └─ providers.ts
│  ├─ core/
│  │  ├─ errors.ts
│  │  ├─ file-output-publisher.ts
│  │  ├─ model-discovery.ts
│  │  └─ vision-service.ts
│  ├─ providers/
│  │  ├─ base-provider.ts
│  │  ├─ openai-compatible.adapter.ts
│  │  └─ provider-factory.ts
│  ├─ tools/
│  │  ├─ animate-image.ts
│  │  ├─ edit-image.ts
│  │  ├─ generate-image.ts
│  │  ├─ generate-video.ts
│  │  ├─ get-job-status.ts
│  │  ├─ get-model-capabilities.ts
│  │  └─ list-models.ts
│  ├─ types/
│  │  └─ contracts.ts
│  ├─ utils/
│  │  ├─ mime.ts
│  │  └─ path.ts
│  └─ validation/
│     ├─ common.ts
│     ├─ image.ts
│     ├─ job.ts
│     ├─ output.ts
│     ├─ schemas.ts
│     └─ video.ts
└─ outputs/

✅ Current supported workflow

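Image

  • generate_image
  • edit_image

Video

  • generate_video
  • animate_image
  • get_job_status

Discovery

  • list_models
  • get_model_capabilities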


📦 Installation

Requirements

  • Node.js 18+
  • npm
  • an OpenAI-compatible provider endpoint

Install from npm

npm install -g vision-generator-mcp

Run installed binary

vision-generator-mcp

Local development install

npm install

Type-check

npm run check

Build

npm run build

Publishable package notes

  • CLI entry is exposed via bin
  • installable package files are limited via files
  • build runs automatically before publish/install from source via prepare

⚙️ MCP settings

This server reads configuration from MCP settings using:

  • PROVIDER_BASE_URL
  • PROVIDER_API_KEY
  • PROVIDER_TIMEOUT_MS
  • DOWNLOAD_TIMEOUT_MS

Example configuration:

{
  "mcpServers": {
    "vision-generator": {
      "command": "node",
      "args": [
        "d:/All_project/own/AI_Coder/Native Tools/vision-generator/build/index.js"
      ],
      "disabled": false,
      "timeout": 600,
      "alwaysAllow": [],
      "disabledTools": [],
      "env": {
        "PROVIDER_BASE_URL": "https://ai.rayzs.qzz.io/v1",
        "PROVIDER_API_KEY": "your-api-key-1",
        "PROVIDER_TIMEOUT_MS": "300000",
        "DOWNLOAD_TIMEOUT_MS": "300000"
      }
    }
  }
}

Timeout layers

| Timeout | Scope |
|---|---|
| timeout in MCP settings | How long the MCP host waits for the server tool call |
| PROVIDER_TIMEOUT_MS | Timeout for provider API requests |
| DOWNLOAD_TIMEOUT_MS | Timeout for binary asset download |

The provider timeout configuration is loaded by loadProviderConfig() and applied in OpenAICompatibleAdapter.
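
As a rough idea of what reading the two timeout variables could look like, here is a small sketch. It is not the package's `loadProviderConfig()`: the `TimeoutConfig` shape, the `parseTimeoutMs` and `loadTimeouts` helpers, and the fallback value are all hypothetical, and the real loader may validate or default differently.

```typescript
// Hypothetical config shape; the field names are illustrative.
interface TimeoutConfig {
  providerTimeoutMs: number;
  downloadTimeoutMs: number;
}

// Assumed fallback for missing or invalid values (not a documented default).
const FALLBACK_MS = 300000;

// Parse a millisecond value from an env string; reject non-numeric or non-positive input.
function parseTimeoutMs(raw: string | undefined, fallback: number): number {
  const value = Number(raw);
  return Number.isFinite(value) && value > 0 ? Math.floor(value) : fallback;
}

// Takes a plain record so it can be fed process.env or a test fixture.
function loadTimeouts(env: Record<string, string | undefined>): TimeoutConfig {
  return {
    providerTimeoutMs: parseTimeoutMs(env["PROVIDER_TIMEOUT_MS"], FALLBACK_MS),
    downloadTimeoutMs: parseTimeoutMs(env["DOWNLOAD_TIMEOUT_MS"], FALLBACK_MS),
  };
}

console.log(loadTimeouts({ PROVIDER_TIMEOUT_MS: "120000", DOWNLOAD_TIMEOUT_MS: "oops" }));
// prints { providerTimeoutMs: 120000, downloadTimeoutMs: 300000 }
```

Env values always arrive as strings, which is why the example configuration quotes the numbers. A parsed value like this could then be handed to something such as `AbortSignal.timeout()` when making the HTTP request.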


📁 Output strategy

output.directory is required for image and video tools.

Recommended folders:

  • outputs/
  • d:/All_project/own/AI_Coder/Native Tools/vision-generator/outputs

Why this is better:

  • every result has a clear location
  • no hidden temp output behavior
  • no base64 image spam in chat context
  • a better fit for local-first workflows where outputs sit in a known folder next to the project
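
The "write to disk, return only a path" idea behind these points might look something like the sketch below. The `saveBase64Asset` helper and the `SavedAsset` shape are hypothetical and only illustrate the approach; the package's file-output code is not shown in this README and may work differently.

```typescript
import { existsSync, mkdirSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical result shape: only the path and size go back to the client.
interface SavedAsset {
  final_path: string;
  bytes: number;
}

// Decode a base64 payload, write it under `directory`, and return metadata only.
function saveBase64Asset(
  base64: string,
  directory: string,
  filename: string,
  createDirectory: boolean,
): SavedAsset {
  if (createDirectory) {
    mkdirSync(directory, { recursive: true }); // no-op if it already exists
  } else if (!existsSync(directory)) {
    throw new Error(`Output directory does not exist: ${directory}`);
  }
  const data = Buffer.from(base64, "base64");
  const final_path = join(directory, filename);
  writeFileSync(final_path, data);
  return { final_path, bytes: data.length };
}

// Demo: "aGVsbG8=" is base64 for the 5-byte string "hello".
const demoDir = join(mkdtempSync(join(tmpdir(), "vision-demo-")), "outputs");
const saved = saveBase64Asset("aGVsbG8=", demoDir, "sample.bin", true);
console.log(saved.bytes, existsSync(saved.final_path));
// prints 5 true
```

The returned object is a few dozen bytes regardless of how large the image is, which is what keeps the chat context small.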

🛠️ Tool reference

list_models

Discover image/video-capable models.

Example output

{
  "provider": "https://ai.rayzs.qzz.io/v1",
  "models": [
    {
      "id": "gpt-image-2",
      "operations": {
        "image_generation": true,
        "image_editing": true,
        "image_variation": false,
        "text_to_video": false,
        "image_to_video": false
      }
    }
  ]
}

get_model_capabilities

Inspect a discovered model.

Example input

{
  "model": "gpt-image-2"
}

generate_image

Generate an image and write it to your chosen folder.

Example input

{
  "model": "gpt-image-2",
  "prompt": "A futuristic Jakarta skyline at sunset, cinematic lighting",
  "aspect_ratio": "16:9",
  "resolution": "1536x1024",
  "output": {
    "directory": "d:/All_project/own/AI_Coder/Native Tools/vision-generator/outputs",
    "filename_prefix": "jakarta-future-city",
    "create_directory": true
  }
}

Example output

{
  "status": "succeeded",
  "provider": "https://ai.rayzs.qzz.io/v1",
  "model": "gpt-image-2",
  "operation": "image_generation",
  "outputs": [
    {
      "type": "image",
      "mime_type": "image/png",
      "final_path": "d:/All_project/own/AI_Coder/Native Tools/vision-generator/outputs/jakarta-future-city_2026-05-19T01-00-00-000Z.png",
      "width": 1536,
      "height": 1024
    }
  ]
}
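
The final_path in the example output looks like a filename prefix followed by an ISO timestamp with the `:` and `.` characters replaced by `-`. A minimal sketch of that naming scheme, inferred purely from the example (the `buildFilename` helper is hypothetical and the package may build names another way):

```typescript
// Build "<prefix>_<timestamp>.<extension>", where the timestamp is an ISO 8601
// string made filesystem-safe by replacing ":" and "." with "-".
function buildFilename(prefix: string, extension: string, now: Date): string {
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return `${prefix}_${stamp}.${extension}`;
}

console.log(
  buildFilename("jakarta-future-city", "png", new Date("2026-05-19T01:00:00.000Z")),
);
// prints "jakarta-future-city_2026-05-19T01-00-00-000Z.png"
```

Replacing the colons matters on Windows, where `:` is not allowed in filenames, and the millisecond component makes collisions between consecutive generations unlikely.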

edit_image

Edit a local image and save the output.

Example input

{
  "model": "gpt-image-2",
  "prompt": "Replace the background with a neon cyberpunk street",
  "image_path": "d:/assets/input.png",
  "output": {
    "directory": "d:/All_project/own/AI_Coder/Native Tools/vision-generator/outputs",
    "filename_prefix": "edited-scene",
    "create_directory": true
  }
}

generate_video

Submit an async text-to-video job.

Example input

{
  "model": "your-video-model",
  "prompt": "A cinematic aerial shot flying over a futuristic city",
  "duration_seconds": 5,
  "fps": 24,
  "output": {
    "directory": "d:/All_project/own/AI_Coder/Native Tools/vision-generator/outputs",
    "filename_prefix": "future-city-video",
    "create_directory": true
  }
}

Example submit result

{
  "status": "submitted",
  "provider": "https://ai.rayzs.qzz.io/v1",
  "model": "your-video-model",
  "operation": "text_to_video",
  "job_id": "video_...",
  "provider_job_id": "provider_...",
  "outputs": []
}

animate_image

Submit an async image-to-video job.

get_job_status

Poll video job status until the final file is downloaded and written to your chosen folder.
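
A client-side polling loop around a status call could be sketched like this. Everything here is illustrative: `pollUntilDone` and `PollOptions` are hypothetical, the status callback stands in for a real get_job_status call, and the `running` and `failed` states are assumptions (only `submitted` and `succeeded` appear in this README).

```typescript
// "submitted" and "succeeded" come from the README examples; the others are assumed.
type JobStatus = "submitted" | "running" | "succeeded" | "failed";

interface PollOptions {
  intervalMs: number;
  maxAttempts: number;
  // Injectable so tests can skip real waiting.
  sleep?: (ms: number) => Promise<void>;
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Call getStatus until it reports a terminal state or attempts run out.
async function pollUntilDone(
  getStatus: () => Promise<JobStatus>,
  options: PollOptions,
): Promise<JobStatus> {
  const sleep = options.sleep ?? defaultSleep;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === "succeeded" || status === "failed") return status;
    // Wait between attempts, but not after the last one.
    if (attempt < options.maxAttempts) await sleep(options.intervalMs);
  }
  throw new Error(`Job still pending after ${options.maxAttempts} attempts`);
}

// Demo with a fake status source that succeeds on the third call.
let demoCalls = 0;
const fakeStatus = async (): Promise<JobStatus> => {
  demoCalls += 1;
  return demoCalls < 3 ? "running" : "succeeded";
};

pollUntilDone(fakeStatus, { intervalMs: 10, maxAttempts: 5 }).then((status) =>
  console.log(status, demoCalls),
);
// prints succeeded 3
```

Capping the number of attempts keeps a stuck job from holding the tool call open past the host's own timeout.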


🧩 Implementation map

| Concern | Entry point |
|---|---|
| Composition root | src/index.ts |
| Main orchestration | src/core/vision-service.ts |
| Provider contract | src/providers/base-provider.ts |
| OpenAI-compatible provider | src/providers/openai-compatible.adapter.ts |
| Adapter selection | src/providers/provider-factory.ts |
| Model discovery | src/core/model-discovery.ts |
| File output | src/core/file-output-publisher.ts |
| Validation layer | src/validation/ |
| Tool handlers | src/tools/ |
| Utilities | src/utils/ |


🔮 Future provider list

Easiest next additions

  • more OpenAI-compatible gateways
  • provider-specific quirks layer

Best next adapters

Later / higher-effort adapters

These are roadmap targets, not currently implemented files.


🧪 Development workflow

npm install
npm run check
npm run build

After changing MCP settings or rebuilding:

  • reload the MCP runtime / extension
  • start a fresh session if needed

✅ Project status

  • local MCP server implemented
  • OpenAI-compatible provider adapter implemented
  • modular structure aligned with the plan
  • explicit output directory required
  • configurable provider/download timeout support added
  • no image base64 context bloat
  • build verified
  • ready for runtime MCP usage after MCP reload