opencode-vision-paste
v0.3.0
Published
OpenCode plugin: intercept pasted images → local VL API analysis → replace with text
Maintainers
Readme
opencode-vision-paste
OpenCode plugin: intercept pasted images → local VL API analysis → replace with text
English | 简体中文 | 繁體中文 | 日本語 | 한국어 | Español | Français | Deutsch | Русский | Português
You paste an image into your OpenCode session. The LLM can't see it. The plugin can.
opencode-vision-paste intercepts pasted images, sends them to a local VL API (e.g., Qwen3VL running via llama.cpp), and replaces the image with a detailed text description — seamlessly, before the LLM ever sees the message.
No cloud dependencies. No API keys required. Your images stay local.
Features
- Transparent — images are silently converted to text. The LLM never knows there was an image
- Zero dependencies — single ESM file, no build step, no npm deps
- Multi-image support — paste multiple images at once; each is analyzed and numbered
- Local-first — works with any OpenAI-compatible VL API (llama.cpp, vLLM, etc.)
- Smart caching — reuses analysis results for duplicate images across conversation turns
- HTTP & data URLs — handles both pasted images (data: URLs) and web-hosted images
- Smart skip — automatically detects when the current chat model natively supports images and skips interception
Installation
For Humans
Copy and paste this prompt to your OpenCode agent:
Install and configure opencode-vision-paste by following the instructions here:
https://raw.githubusercontent.com/wsaaaqqq/opencode-vision-paste/main/readme/INSTALL.mdFor LLM Agents
curl -s https://raw.githubusercontent.com/wsaaaqqq/opencode-vision-paste/main/readme/INSTALL.mdOr read the Installation Guide directly.
Quick Start
- Install the plugin (see Installation above)
- Configure your VL API endpoint in
.opencode/vision-paste.config.jsonc(optional — defaults work if you have a local llama.cpp running) - Paste an image in any OpenCode chat — watch it turn into text
Configuration
All settings are optional. The plugin works with sensible defaults.
| Option | Default | Description |
|--------|---------|-------------|
| apiBaseUrl | http://192.168.9.44:5678/v1 | OpenAI-compatible VL API endpoint |
| apiModel | Qwen3VL-8B-Instruct-Q4_K_M.gguf | Model name for the VL API |
| apiKey | "" | API key (leave empty if not required) |
| promptTemplate | "" | Custom prompt (overrides promptLocale); {userText} is replaced with the user's message |
| promptLocale | zh | Built-in prompt language: en, zh, ja, ko, es, fr, de, ru, pt |
| skipIfModelSupportsVision | true | Skip interception when the current chat model natively supports images |
| visionModels | [] | Extra model ID patterns to treat as vision-capable (case-insensitive) |
| healthCheckOnStart | true | Ping VL API on session start; warn if unreachable |
| verbose | false | Show analysis progress in chat messages |
| errorHints | true | Include troubleshooting suggestions in error messages |
Config file locations (first found wins):
.opencode/vision-paste.config.jsonc(project-level)~/.config/opencode/vision-paste.config.jsonc(user-level)
Full reference: CONFIGURATION.md
How it works
User pastes image
↓
opencode-vision-paste intercepts `experimental.chat.messages.transform`
↓
Decodes image (data URL or HTTP) → saves temp file
↓
Sends to local VL API (OpenAI-compatible chat completions)
↓
Replaces image part with analysis text
↓
Temp file deleted — LLM sees text onlyThe plugin hooks into OpenCode's experimental.chat.messages.transform pipeline, running before the message is sent to the LLM.
VL API Setup
The plugin needs a running VL API. Choose one:
Docker (one command)
# 1. Download the model
mkdir models
wget -P models https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct-GGUF/resolve/main/Qwen3VL-8B-Instruct-Q4_K_M.gguf
# 2. Start the server
docker compose up -dOllama
ollama pull qwen2.5-vl:7b
# Then configure apiBaseUrl: http://localhost:11434/v1See examples/ for configuration presets.
CLI
npx opencode-vision-paste init # Interactive setup wizard
npx opencode-vision-paste doctor # Diagnose plugin + VL API
npx opencode-vision-paste config # Show current configurationFAQ
Q: Images are not being analyzed?
Run npx opencode-vision-paste doctor to check your setup. Common issues: VL API not running, wrong URL, or model not loaded.
Q: How do I use a different VL API backend?
Update apiBaseUrl in .opencode/vision-paste.config.jsonc. Any OpenAI-compatible endpoint works (Ollama, vLLM, llama.cpp, LM Studio, etc.).
Q: Can I force the plugin to always process images?
Set "skipIfModelSupportsVision": false in your config.
Q: The prompt is in Chinese, can I change it?
Set "promptLocale": "en" or provide a custom "promptTemplate".
Contributing
PRs welcome! See CONTRIBUTING.md.
