pixelbridge-mcp

v0.1.3

Published

21 days ago

Local MCP server and Chrome extension for ChatGPT browser-based image generation workflows.

douglasotto

pixelbridge mcp model-context-protocol chatgpt chrome-extension image-generation browser-automation

PixelBridge MCP

PixelBridge MCP is a local TypeScript MCP server and Chrome extension bundle that orchestrates ChatGPT image generation from the user's real browser tab, persists artifacts under artifacts/generated-images/chatgpt/, and returns MCP-friendly structured results.

License: MIT. See LICENSE.

The intended install target is a CLI installed locally through npm. The executable name is pixelbridge-mcp.

Quick Install

Install the published CLI:

npm install -g pixelbridge-mcp

Then start the server:

pixelbridge-mcp

You also need to load the packaged Chrome extension from the installed package's extension/ directory. For the full install path, see Installing The CLI.

End-user installation docs:

The MCP client guide includes concrete setup examples for:

VS Code / GitHub Copilot in VS Code
Cursor
Windsurf
JetBrains IDEs
OpenAI Codex CLI
OpenAI Codex IDE/app surfaces
Claude Code
generic stdio MCP clients
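
As a rough illustration of what a stdio client entry tends to look like, the sketch below builds a config object in the `mcpServers` shape that several MCP clients accept. The `buildClientConfig` helper and the `pixelbridge` server label are hypothetical, and the top-level key and file location vary by client, so treat the MCP client guide as the authority for each one.

```typescript
// Hypothetical helper: builds a stdio server entry for an MCP client config.
// The "mcpServers" key and the "pixelbridge" label are assumptions; some
// clients use a different top-level key or a non-JSON file format.
interface StdioServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

function buildClientConfig(
  env: Record<string, string> = {},
): { mcpServers: Record<string, StdioServerEntry> } {
  // The executable name comes from the installed CLI.
  const entry: StdioServerEntry = { command: "pixelbridge-mcp", args: [] };
  // Only attach an env block when there is something to pass through.
  if (Object.keys(env).length > 0) {
    entry.env = { ...env };
  }
  return { mcpServers: { pixelbridge: entry } };
}

// Serialize for pasting into a client's JSON config file.
const exampleConfigJson = JSON.stringify(
  buildClientConfig({ PIXELBRIDGE_EXTENSION_BRIDGE_PORT: "47821" }),
  null,
  2,
);
```

Generating the entry from code keeps the command name and environment overrides in one place if you maintain configs for more than one client.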

Additional user-facing docs now cover:

clean-machine installation verification
Windows/macOS/Linux config snippets
common failure modes and answers

Public Links

Current public project and support URLs:

Project home: https://github.com/DouglasOttoDavila/image-generation-mcp-server
Issue tracker: https://github.com/DouglasOttoDavila/image-generation-mcp-server/issues
Support page: https://github.com/DouglasOttoDavila/image-generation-mcp-server/blob/main/docs/support.md

Current public privacy-policy draft URL:

https://github.com/DouglasOttoDavila/image-generation-mcp-server/blob/main/docs/privacy-policy.md

If you later publish GitHub Pages for this repo, replace the privacy-policy URL above with the final Pages URL and reuse the same link in the Chrome Web Store listing.

What It Does

Exposes MCP tools for active-tab image generation, fresh-chat image generation, existing-GPT image generation, browser-session validation, and tab diagnostics.
Hosts a localhost extension bridge so a ChatGPT content script can register already-open tabs.
Persists generated images plus metadata sidecars under artifacts/generated-images/chatgpt/<YYYY-MM-DD>/.

Important Limitation

This server depends on two things: the Chrome extension must be loaded into a real, logged-in chatgpt.com tab, and ChatGPT's live DOM must remain compatible with the current selectors. End-to-end behavior is therefore environment-dependent and should be treated as browser automation, not as a stable API integration.

Current migration state:

validate_browser_session(), list_connected_chatgpt_tabs(), select_chatgpt_tab(), generate_image_active_tab(...), generate_image_new_chat(...), and generate_image_with_gpt(...) now use the extension bridge.
generate_image_with_gpt(...) does not navigate the GPT directory in MVP. It verifies that the selected tab is already on the requested GPT page.
list_matching_gpts(...) still uses the legacy browser automation path.

Setup

This section is the developer/source-repo setup path. If you are installing the packaged CLI, use Installing The CLI instead.

Install dependencies:

npm install

Set environment variables:

$env:PIXELBRIDGE_EXTENSION_BRIDGE_HOST="127.0.0.1"
$env:PIXELBRIDGE_EXTENSION_BRIDGE_PORT="47821"

You can also place the same values in a local .env file. The server auto-loads .env on startup and accepts either standard dotenv lines or PowerShell-style $env: lines.

A safe starter file is included at .env.example.
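
To make the two accepted line styles concrete, here is a minimal parser sketch that handles both `KEY=value` and `$env:KEY="value"` lines. It is an illustration only, not the server's actual loader; the function name `parseEnvFile` and the comment and quoting rules are assumptions.

```typescript
// Sketch of a parser for a .env file that mixes standard dotenv lines and
// PowerShell-style `$env:` lines. Not the server's real implementation.
function parseEnvFile(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and `#` comments (assumed comment syntax).
    if (line === "" || line.startsWith("#")) continue;
    // Optional `$env:` prefix, then a variable name, `=`, and the value.
    const match = /^(?:\$env:)?([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$/i.exec(line);
    if (!match) continue;
    const [, key = "", rawValue = ""] = match;
    let value = rawValue.trim();
    // Strip one pair of matching surrounding quotes, if present.
    const first = value.charAt(0);
    if (value.length >= 2 && (first === '"' || first === "'") && value.endsWith(first)) {
      value = value.slice(1, -1);
    }
    result[key] = value;
  }
  return result;
}
```

Both styles normalize to the same key and value, so a file copied from a PowerShell session works without edits.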

Optional variables:

PIXELBRIDGE_RUNTIME_HOME
PIXELBRIDGE_EXTENSION_BRIDGE_PORT
PIXELBRIDGE_EXTENSION_BRIDGE_HOST
PIXELBRIDGE_ARTIFACT_ROOT
PIXELBRIDGE_RETURN_MODE (paths or base64)
PIXELBRIDGE_DEFAULT_TIMEOUT_MS
PIXELBRIDGE_MAX_TIMEOUT_MS
PIXELBRIDGE_RETRY_ATTEMPTS
PIXELBRIDGE_RETRY_BASE_DELAY_MS

Legacy CHATGPT_* environment variables are still accepted for backward compatibility.
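
The sketch below shows one way the PIXELBRIDGE_* settings and their legacy fallbacks could be read. It assumes each legacy variable uses the same suffix under a CHATGPT_ prefix and that paths is the default return mode; neither is confirmed here, and the helper names are hypothetical.

```typescript
// Sketch: read a setting by suffix, preferring PIXELBRIDGE_* over CHATGPT_*.
// Assumption: legacy names share the same suffix; the real mapping may differ.
type Env = Record<string, string | undefined>;

function readSetting(env: Env, suffix: string): string | undefined {
  const current = env[`PIXELBRIDGE_${suffix}`];
  if (current !== undefined && current !== "") return current;
  const legacy = env[`CHATGPT_${suffix}`];
  return legacy !== undefined && legacy !== "" ? legacy : undefined;
}

type ReturnMode = "paths" | "base64";

// Accept only the two documented values; anything else uses the fallback.
// The "paths" default is an assumption.
function readReturnMode(env: Env, fallback: ReturnMode = "paths"): ReturnMode {
  const raw = readSetting(env, "RETURN_MODE");
  return raw === "paths" || raw === "base64" ? raw : fallback;
}

// Parse a positive integer setting such as a port or a timeout in ms.
function readPositiveInt(env: Env, suffix: string, fallback: number): number {
  const raw = readSetting(env, suffix);
  if (raw === undefined) return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}
```

In a Node process you would pass `process.env` as the first argument, for example `readPositiveInt(process.env, "EXTENSION_BRIDGE_PORT", 47821)`.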

Runtime Home

When PIXELBRIDGE_RUNTIME_HOME is not set, the server now defaults to an OS app-data directory instead of the repo folder:

Windows: %LOCALAPPDATA%\pixelbridge-mcp
macOS: ~/Library/Application Support/pixelbridge-mcp
Linux: ${XDG_DATA_HOME:-~/.local/share}/pixelbridge-mcp

Default artifact output is created under:

<runtime-home>/artifacts/generated-images/chatgpt/

The local bridge token and managed browser profile data also live under the runtime home.
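
The per-OS defaults above can be sketched as a small resolver. This mirrors the documented locations but is not the server's code; the function name is hypothetical, and the Windows fallback used when LOCALAPPDATA is unset is an assumption.

```typescript
import * as path from "node:path";

// Sketch of the documented defaults; the server's real logic may differ.
type EnvMap = Record<string, string | undefined>;

function resolveRuntimeHome(platform: string, env: EnvMap, homeDir: string): string {
  // An explicit override always wins.
  const override = env["PIXELBRIDGE_RUNTIME_HOME"];
  if (override) return override;

  const appName = "pixelbridge-mcp";
  if (platform === "win32") {
    // Assumption: use the conventional AppData\Local path if LOCALAPPDATA is unset.
    const base = env["LOCALAPPDATA"] ?? path.win32.join(homeDir, "AppData", "Local");
    return path.win32.join(base, appName);
  }
  if (platform === "darwin") {
    return path.posix.join(homeDir, "Library", "Application Support", appName);
  }
  // Linux and other Unix-like systems: honor XDG_DATA_HOME when set and non-empty.
  const xdg = env["XDG_DATA_HOME"];
  const base = xdg ? xdg : path.posix.join(homeDir, ".local", "share");
  return path.posix.join(base, appName);
}
```

Taking the platform, environment, and home directory as parameters keeps the function easy to check for all three operating systems from one machine. In a running process you would call it as `resolveRuntimeHome(process.platform, process.env, os.homedir())`.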

Packaging Notes

The npm package is intended to ship runtime artifacts only:

dist/src/*
extension runtime files under extension/dist/
extension/manifest.json
extension/popup.html
extension/options.html
runtime README files

It should not ship compiled tests or raw extension TypeScript source files.

Current extension distribution model:

downloadable/manual install first
load the bundled extension/ directory as an unpacked extension
Chrome Web Store submission is documented but not the default distribution path yet

Extension Bridge

Build the extension:

npm run build:extension

Then load extension/ as an unpacked extension in Chrome. The extension uses the local handshake endpoint:

http://127.0.0.1:47821/handshake

See extension/README.md for the current MVP behavior.
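
If the extension does not seem to connect, a quick reachability probe against the handshake URL can help separate "server not running" from "extension not loaded". The sketch below assumes Node 18 or later for the global `fetch`. The helper names are hypothetical, and the HTTP method and response body the endpoint expects are not documented here, so the probe only checks whether anything answers.

```typescript
// Sketch: build the handshake URL from the bridge host and port.
function buildHandshakeUrl(host = "127.0.0.1", port = 47821): string {
  return `http://${host}:${port}/handshake`;
}

// Returns true if anything answers at the URL, whatever the status code.
// This is a reachability check only, not a real handshake.
async function isBridgeReachable(url: string, timeoutMs = 2000): Promise<boolean> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    await fetch(url, { signal: controller.signal });
    return true; // any HTTP response means something is listening
  } catch {
    return false; // connection refused, timeout, or another network failure
  } finally {
    clearTimeout(timer);
  }
}
```

A `false` result with the server running usually points at a host or port mismatch between the environment variables and the URL the extension uses.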

First Login

Open https://chatgpt.com in your normal Chrome profile, log in once, and keep the unpacked extension enabled for that tab. For extension-backed tools, the server no longer needs to launch or own the browser profile.

Running

npm run build
npm run start

For an installed CLI, run:

pixelbridge-mcp

For generic MCP client wiring examples, see MCP Client Configuration.

For local development:

npm run dev

MCP Tools

validate_browser_session()
list_connected_chatgpt_tabs()
select_chatgpt_tab(tabKey)
generate_image_active_tab(prompt, returnMode?, timeoutMs?)
generate_image_new_chat(prompt, returnMode?, timeoutMs?)
generate_image_with_gpt(gptName, prompt, returnMode?, timeoutMs?)
list_matching_gpts(query)
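
For readers wiring a client by hand, the sketch below shows how one of these tools could be invoked as an MCP `tools/call` JSON-RPC request. The argument types are inferred from the signatures listed above and are assumptions, as are the helper name, the example prompt, and the timeout value.

```typescript
// Argument shape inferred from the tool list above; treat it as an assumption.
type ToolReturnMode = "paths" | "base64";

type GenerateImageArgs = {
  prompt: string;
  returnMode?: ToolReturnMode;
  timeoutMs?: number;
};

// Shape of an MCP tools/call request in JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and arguments in a request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const activeTabArgs: GenerateImageArgs = {
  prompt: "A clean studio product photo of a ceramic mug",
  returnMode: "paths",
  timeoutMs: 120000, // illustrative value, not a documented default
};

const exampleRequest = buildToolCall(1, "generate_image_active_tab", activeTabArgs);
```

Most MCP client libraries build this envelope for you; the explicit form is mainly useful when debugging raw stdio traffic.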

Artifact Layout

Successful runs save files under:

<runtime-home>/artifacts/generated-images/chatgpt/YYYY-MM-DD/

Each run writes:

one or more image files
one <run-id>.metadata.json file with prompt, GPT input, resolved GPT, timestamps, backend usage, and saved paths
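
The layout above can be expressed as a pair of path helpers, which is handy when a script needs to locate the metadata sidecar for a run. This is a sketch under assumptions: the helper names are hypothetical, and the date folder is computed in local time, whereas the server may use UTC.

```typescript
import * as path from "node:path";

// Format a date as YYYY-MM-DD. Assumption: local time; the server may use UTC.
function dateFolder(date: Date): string {
  const yyyy = String(date.getFullYear());
  const mm = String(date.getMonth() + 1).padStart(2, "0");
  const dd = String(date.getDate()).padStart(2, "0");
  return `${yyyy}-${mm}-${dd}`;
}

// Directory that holds all artifacts for the given day.
function artifactDir(runtimeHome: string, date: Date): string {
  return path.join(runtimeHome, "artifacts", "generated-images", "chatgpt", dateFolder(date));
}

// Path of the metadata sidecar for one run.
function metadataPath(runtimeHome: string, date: Date, runId: string): string {
  return path.join(artifactDir(runtimeHome, date), `${runId}.metadata.json`);
}
```

If PIXELBRIDGE_ARTIFACT_ROOT is set, the root directory differs from the default, so the first path segments here would need to change accordingly.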

Manual Verification

Load the unpacked extension and open https://chatgpt.com.
Run list_connected_chatgpt_tabs() and confirm at least one ChatGPT tab is registered.
If multiple ChatGPT tabs are connected through the extension, use select_chatgpt_tab(tabKey).
Run validate_browser_session() and confirm it reports authenticated: true.
Run generate_image_active_tab("A clean studio product photo of a ceramic mug").
Confirm an image file and metadata JSON appear under <runtime-home>/artifacts/generated-images/chatgpt/<today>/.
Run generate_image_new_chat("A clean studio product photo of a ceramic mug") and confirm it starts a fresh chat before generating.
Open a specific GPT manually, then run generate_image_with_gpt("Your GPT Name", "A cinematic concept sketch").
Run list_matching_gpts("Dall e") only if you still want the legacy GPT directory lookup path.

Testing

npm test

The automated tests cover config parsing, dotenv loading, fuzzy GPT matching, artifact persistence, response formatting, backend routing, run locking, and validation error propagation. Live ChatGPT automation remains manual because the consumer UI is brittle and environment-dependent.