opencode-whatsapp

v0.3.1

Published

2 months ago

WhatsApp gateway for OpenCode — talk to AI from WhatsApp

eizo__void

whatsapp messaging gateway opencode ai claude baileys

OpenCode WhatsApp Gateway

Connect your WhatsApp phone to OpenCode AI. Send messages to chat with Claude, manage sessions, switch models, and monitor gateway status — all from WhatsApp.

Status: ✅ Production Ready | Last Updated: 2026-04-04 | Version: 0.3.0

Install

npm install -g opencode-whatsapp

Or as an OpenCode plugin:

opencode plugin opencode-whatsapp

Quick Start

1. Configure

opencode-whatsapp config

Walks you through:

Phone number linking (QR scan)
Phone type (personal or dedicated)
Access policy (allowlist, pairing, open, disabled)
Optional command prefix

2. Run

opencode-whatsapp run

Connects to WhatsApp and talks to OpenCode directly (reading its config, database, and AI providers). Once connected, the gateway is ready to receive messages.

3. Use

Send from WhatsApp's "Message yourself" chat:

/ping → Health check
/help → Show all commands
/sesslist → List sessions
/new → Create session
/sess 1 → Pin session
Any message → Routes to AI

How It Works

┌─────────────────────────────────────────────────────────────┐
│ Your WhatsApp Phone (Linked Device)                         │
│ Send: "What is quantum computing?"                          │
└─────────────────────────────────────────────────────────────┘
                          ↓ (WhatsApp API)
┌─────────────────────────────────────────────────────────────┐
│ Baileys WebSocket (WhatsApp Web Protocol)                   │
│ · Receives message as proto.IWebMessageInfo                 │
│ · Handles media files (images, PDFs, audio, video)          │
│ · Maintains connection with Linked Device                   │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Gateway: extensions/whatsapp/src/gateway.ts                 │
│ · Handles connection lifecycle                              │
│ · Manages reconnection (3s backoff)                         │
│ · Queues incoming messages                                  │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Monitor: extensions/whatsapp/src/inbound/monitor.ts         │
│ · Extract text from message                                 │
│ · Download & convert media (base64 data URL)                │
│ · Parse location & reply context                            │
│ · Deduplicate messages                                      │
│ · Create WebInboundMessage object                           │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Handler: packages/opencode/src/cli/cmd/whatsapp.ts          │
│ · Detect control commands (/help, /status, /sess, etc)      │
│ · Handle self-chat only (safety)                            │
│ · Check if session is pinned                                │
│ · Route to AI if regular message                            │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ SessionPrompt.prompt()                                      │
│ · Queue message (serialized, no parallel)                   │
│ · Send with: sessionID, text, media, model                  │
│ · channel="whatsapp" (bot knows source)                     │
│ · Listen for quota errors (abort if needed)                 │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Claude AI Model                                             │
│ · Receives: text message + optional media                   │
│ · Processes: image analysis, document extraction, etc.      │
│ · Responds: natural language reply                          │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Response Handler                                            │
│ · Extract text from response                                │
│ · Track model used (for /status)                            │
│ · Calculate elapsed time                                    │
│ · Format via ResponseFormatter                              │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Send Reply: msg.reply(text)                                 │
│ · sock.sendMessage(jid, { text })                           │
│ · Splits long responses (4000 char limit)                   │
│ · Sends back to WhatsApp                                    │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ Your WhatsApp Phone                                         │
│ Notification: "Claude: Quantum computing is..."             │
└─────────────────────────────────────────────────────────────┘
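
The reply step above splits long responses at a 4000-character limit. The sketch below shows one way such a splitter could work. It assumes a hypothetical `splitReply` helper that prefers to break at a newline or a space; the gateway's actual splitting logic is not shown in this README.

```typescript
// Hypothetical helper (not the gateway's real code): split a long reply into
// chunks of at most `limit` characters. The 4000 default mirrors the limit
// named in the diagram; the break strategy is an assumption.
function splitReply(text: string, limit = 4000): string[] {
  if (limit <= 0) throw new RangeError("limit must be positive");
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    // Prefer the last newline, then the last space, inside the window.
    const window = rest.slice(0, limit);
    let cut = window.lastIndexOf("\n");
    if (cut <= 0) cut = window.lastIndexOf(" ");
    if (cut <= 0) cut = limit; // no usable whitespace: hard cut at the limit
    const piece = rest.slice(0, cut).trimEnd();
    if (piece.length > 0) chunks.push(piece);
    rest = rest.slice(cut).trimStart();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Each returned chunk would then be sent as its own message, in order.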

Features

🤖 AI Integration

Claude AI: Access OpenCode's configured AI models
Media support: Send images, PDFs, audio, videos
Context: Multi-turn conversations maintained in sessions
Channel-aware: AI knows it's WhatsApp (can tailor responses)

📋 Session Management

Multiple sessions: Create unlimited conversations
Pin/switch: /sess <n> to switch between sessions
Session titles: Each session remembers its name
Persistent: Survives gateway restarts
Database: All history stored locally

🎯 Model Selection

Model list: /modellist shows all available models
Per-session models: Each session can use a different model
Free & paid: Mix of free (Claude, GPT-4o) and paid models
Quick switch: /model <n> to change model immediately
Fallback: Uses the default model if the selected one is unavailable

💬 Control Commands

10 commands total: All self-chat only (safety)
Help system: /help explains every command
Status monitoring: /status shows config + elapsed time
Operation control: /stop to abort long operations
Health check: /ping to verify the gateway is alive

🛡️ Safety & Access Control

Self-chat only: Commands only work in personal chat
Access policies:
- allowlist: Only specified phone numbers
- pairing: Unknown senders get a verification code
- open: Anyone can message
- disabled: Ignore all DMs
No prompt injection: Commands handled at gateway, never sent to AI
Message deduplication: Prevents duplicate processing
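
The four access policies can be read as a small decision table. The sketch below is illustrative only: `decideAccess`, the `Decision` values, and the rule that already-known senders skip the pairing challenge are assumptions, not the gateway's actual implementation.

```typescript
type AccessPolicy = "allowlist" | "pairing" | "open" | "disabled";
type Decision = "accept" | "challenge" | "ignore";

// Hypothetical decision function. `known` holds sender numbers in E.164 format
// that are either allowlisted or (assumed) already paired.
function decideAccess(
  policy: AccessPolicy,
  sender: string,
  known: ReadonlySet<string>,
): Decision {
  switch (policy) {
    case "open":
      return "accept"; // anyone can message
    case "disabled":
      return "ignore"; // all DMs are dropped
    case "allowlist":
      return known.has(sender) ? "accept" : "ignore";
    case "pairing":
      // Unknown senders get a verification code instead of a reply.
      return known.has(sender) ? "accept" : "challenge";
  }
}
```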

📊 Monitoring & Debugging

Gateway status: /status shows session, agent, model, elapsed time
Quota detection: Auto-detects usage limits, aborts expensive operations
Response times: Tracks how long operations take
Connection info: Shows if gateway is running, session IDs, etc.
Verbose mode: WA_VERBOSE=1 opencode-whatsapp run for debugging

📦 Media Support

Supported types:

Images: JPEG, PNG, GIF, WebP (camera photos, screenshots)
Audio: MP3, M4A, OGG, WAV (voice messages, music files)
Documents: PDF, DOCX, TXT, etc. (reports, essays)
Video: MP4, WebM (clips, tutorials)
Stickers: WebP (WhatsApp stickers)

How it works:

Send media file + optional caption from WhatsApp
Gateway downloads media from WhatsApp servers
Converts to base64 data URL (self-contained)
Sends to Claude with text message
Claude analyzes both text and media together
Responds with analysis/summary

Constraints:

Maximum 50MB per file (Baileys limit)
Large files may time out
Media extraction handles errors gracefully
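
To make the conversion step concrete, here is a minimal sketch of turning downloaded bytes into a base64 data URL with a size check. The `toDataUrl` helper is hypothetical. The `MEDIA_MAX_SIZE_MB` name is borrowed from the constants mentioned later in this README, and its value here simply mirrors the 50MB limit above.

```typescript
import { Buffer } from "node:buffer";

// Assumed constant: the name comes from constants.ts as referenced in this
// README; the value mirrors the documented 50MB limit.
const MEDIA_MAX_SIZE_MB = 50;

// Hypothetical helper: wrap raw media bytes in a self-contained data URL.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  const maxBytes = MEDIA_MAX_SIZE_MB * 1024 * 1024;
  if (bytes.byteLength > maxBytes) {
    throw new RangeError(`media exceeds ${MEDIA_MAX_SIZE_MB}MB`);
  }
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}
```

A data URL carries the MIME type and the payload in one string, so it can travel with the text message without a separate file reference.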

Configuration

Setup via CLI

opencode-whatsapp config

Prompts for:

Phone linking: Scan QR in WhatsApp → Linked Devices
Phone type:
- Personal: Your main phone (shows as "Me")
- Dedicated: Separate phone for OpenCode only
Access policy: Who can message the gateway
Allowed numbers: E.164 format (e.g., +1234567890)
Command prefix: Optional (e.g., "/bot" to use "/bot /help")

Config File Location

~/.local/state/opencode/whatsapp/pinned-session.json

Stores:

Currently pinned session ID
Per-session model selections
Persists across restarts
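
As a rough illustration of how this state could be read and written, the sketch below uses an assumed JSON shape. The `PinnedState` fields and both helper names are hypothetical; the real schema of pinned-session.json is not documented here.

```typescript
import { mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";

// Assumed shape of the file; field names are illustrative only.
interface PinnedState {
  pinnedSessionID: string | null;
  models: Record<string, string>; // session ID -> model ID
}

// Returns an empty state when the file is missing or unreadable.
function loadPinnedState(file: string): PinnedState {
  try {
    const parsed = JSON.parse(readFileSync(file, "utf8")) as Partial<PinnedState>;
    return {
      pinnedSessionID: parsed.pinnedSessionID ?? null,
      models: parsed.models ?? {},
    };
  } catch {
    return { pinnedSessionID: null, models: {} };
  }
}

// Creates the parent directory if needed, then writes the state as JSON.
function savePinnedState(file: string, state: PinnedState): void {
  mkdirSync(dirname(file), { recursive: true });
  writeFileSync(file, JSON.stringify(state, null, 2), "utf8");
}
```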

Environment Variables

# Session to start with
export OPENCODE_WA_SESSION="ses_xxx..."

# Enable verbose logging
export WA_VERBOSE=1

# Run gateway
opencode-whatsapp run

Commands Reference

Complete command list with examples. See COMMANDS.md for detailed documentation.

Session Commands

/sesslist             List all sessions
/sess <n>             Pin session by index
/sess <id>            Pin session by ID
/sess show            Show currently pinned
/sess d <n> yes       Delete session (confirm)
/new                  Create new session

Model Commands

/modellist            List available models
/model                Show selected model
/model <n>            Select by list number
/model provider/id    Select by full ID

Control Commands

/help                 Show this help
/status               Show gateway status
/stop                 Stop current operation
/ping                 Health check

Use Cases & Examples

Use Case 1: Technical Research

User: /new                                          Create session
User: /modellist                                    See models
User: /model 3                                      Pick Claude Opus (better reasoning)
User: [sends PDF] Explain the methodology           Claude analyzes PDF
User: But what about section 3.2?                   Follows up in same session
User: /status                                       Verify Opus was used

Use Case 2: Daily Workflow

User: /sesslist                                     See sessions
User: /sess 2                                       Pin Work session
User: What's my task for today?                     Routes to Work session
User: /sess 3                                       Switch to Personal session
User: What should I cook?                           Routes to Personal session
User: [sends recipe image] How long does this take? Claude analyzes image

Use Case 3: Model Comparison

User: /new                                          New session
User: /modellist                                    See models
User: Explain quantum entanglement                  Uses default model
User: /model 5                                      Switch to different model
User: Explain quantum entanglement again            Different model, different explanation
User: /model 2                                      Switch back

Use Case 4: Long Operations

User: Generate a 5000-word essay                    AI starts generating
[waiting 20 seconds...]
User: /stop                                         Cancel operation
User: /model gpt-4                                  Try faster model
User: Generate a 2000-word summary instead          AI completes quickly

Troubleshooting

Gateway won't start

# Check WhatsApp is linked
opencode-whatsapp config

# Check connection
opencode-whatsapp status

# Enable verbose logging
WA_VERBOSE=1 opencode-whatsapp run

Messages not routing to AI

/status                        Check if session is pinned
/sesslist                      List available sessions
/sess 1                        Pin a session
Your message                   Now routes to AI

Model selection not working

/modellist                     See available models
/model 1                       Select by number
/status                        Verify selection

Slow responses

/status                        Check which model is running
/modellist                     Try a faster (free) model
/model 2                       Switch to faster model

Media not downloading

/ping                         Check gateway health
Try smaller file              Large files may time out
/stop                         Clear queue
Try again                     Retry with smaller file

"No pinned session" error

/sesslist                     View sessions
/new                          Create new session
/sess 1                       Pin it
Your message                  Now routes

System Requirements

Device

Phone: Any modern smartphone with WhatsApp installed
Connection: Internet-enabled (WiFi or mobile data)
Account: WhatsApp account active on phone

Server/Computer

Node.js: 18+ (16+ may work but not supported)
Disk: 500MB minimum (more for large session histories)
Memory: 512MB minimum (1GB+ recommended)
Internet: Stable connection (auto-reconnects on loss)

Baileys Version

Currently using: v7.0.0-rc.9
Required for: Media extraction, LID mapping (v6 doesn't work properly)
Auto-managed: Package manager handles version

Architecture & Design

Message Flow

Receive: Baileys WebSocket receives proto.IWebMessageInfo
Normalize: Validate JID, extract E.164 numbers, deduplicate
Enrich: Extract text, media, location, reply context
Handle: Detect control commands or route to AI
Route: SessionPrompt.prompt() with text + optional media
Parse: Extract text from response, format nicely
Send: sock.sendMessage() back to WhatsApp
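
The deduplication part of the Normalize step could look roughly like the sketch below. It assumes messages carry a unique string ID and that remembering IDs for a bounded time is enough. The `SeenMessages` class and its TTL are hypothetical, not the gateway's actual mechanism.

```typescript
// Hypothetical deduplicator: remembers recently seen message IDs for `ttlMs`.
class SeenMessages {
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs = 60_000) {
    this.ttlMs = ttlMs;
  }

  // Returns true the first time an ID is seen within the TTL window,
  // false for a repeat that should be skipped.
  firstSeen(id: string, now: number = Date.now()): boolean {
    // Drop expired entries so the map does not grow without bound.
    for (const [key, at] of this.seen) {
      if (now - at > this.ttlMs) this.seen.delete(key);
    }
    if (this.seen.has(id)) return false;
    this.seen.set(id, now);
    return true;
  }
}
```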

Command System

Registry: Single COMMANDS array in commands.ts
Detection: isCommand() checks message against registry
Handlers: Each command has handler in handleMessage()
No injection: Commands never sent to AI (handled at gateway)
Formatters: ResponseFormatter class provides consistent formatting
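
A minimal sketch of the registry-and-detection idea follows. The `CommandSpec` shape, the three sample entries, and the `parseCommand` helper are assumptions for illustration. Only the names `COMMANDS` and `isCommand()` come from this README, and their real signatures may differ.

```typescript
interface CommandSpec {
  name: string;
  usage: string;
  description: string;
}

// Illustrative subset; the real COMMANDS array in commands.ts is not shown here.
const COMMANDS: readonly CommandSpec[] = [
  { name: "/ping", usage: "/ping", description: "Health check" },
  { name: "/help", usage: "/help", description: "Show all commands" },
  { name: "/sess", usage: "/sess <n>", description: "Pin session by index" },
];

// Returns the matching command and its arguments, or null for a regular
// message that should be routed to the AI.
function parseCommand(text: string): { command: CommandSpec; args: string[] } | null {
  const [head, ...args] = text.trim().split(/\s+/);
  const name = (head ?? "").toLowerCase();
  const command = COMMANDS.find((c) => c.name === name);
  return command ? { command, args } : null;
}

function isCommand(text: string): boolean {
  return parseCommand(text) !== null;
}
```

Matching only on the first token means that text which merely mentions a command, such as "what does /ping do?", is still treated as a regular message.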

Session Persistence

Storage: OpenCode database (~/.openclaw/sessions/)
Pinned state: Saved in pinned-session.json
Per-session models: Each session remembers its model choice
Survives restarts: Gateway stops/starts without losing context

Media Processing

Download: Baileys downloadMediaMessage() fetches from servers
Conversion: Buffer → base64 data URL (self-contained)
Limits: 50MB per file (enforced by Baileys)
Types: All WhatsApp media types supported

Access Control

Self-chat only: Commands restricted to personal chat
Policies: Allowlist, pairing, open, disabled
Phone numbers: E.164 format (+country code)
No relay: Can't use gateway to send to others
Safety by default: Most restrictive policy recommended
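
Since allowlist entries are expected in E.164 format, a small normalizer helps when accepting user input. The sketch below is an assumption about how such a check might look; `normalizeE164` is a hypothetical helper that enforces the general E.164 shape (a plus sign followed by up to 15 digits, the first non-zero).

```typescript
// Hypothetical helper: strip common separators and validate the E.164 shape.
// Returns the compact number, or null when the input is not valid.
function normalizeE164(input: string): string | null {
  const compact = input.replace(/[\s().-]/g, "");
  return /^\+[1-9]\d{1,14}$/.test(compact) ? compact : null;
}
```

For example, `normalizeE164("+1 (234) 567-890")` yields `"+1234567890"`, while a number without the leading `+` is rejected.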

Performance Considerations

Response Time

Typical: 2-5 seconds per message
Large files: 10-30 seconds (file processing)
Complex reasoning: 30+ seconds (model thinking)
Quota detection: <3 seconds (immediate abort)

Database Size

Per session: 10-50KB for typical conversation
With media: 1-50MB depending on file sizes
With long history: Can grow to hundreds of MB
Management: Session.messages() API manages cleanup

Connection Stability

Reconnection: Automatic 3s backoff when disconnected
Offline queuing: Messages held while reconnecting
Timeout: 30-minute idle timeout (phone locked)
Graceful degradation: Connection loss doesn't lose data
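
The reconnection behaviour can be sketched as a retry loop with a fixed delay. This is an illustration under stated assumptions: `connectWithRetry` and its parameters are hypothetical, and the "3s backoff" above is interpreted here as a constant 3-second wait between attempts.

```typescript
// Hypothetical reconnect loop: retry `connect` with a fixed delay between
// attempts. Resolves with the number of attempts it took.
async function connectWithRetry(
  connect: () => Promise<void>,
  delayMs = 3000,
  maxAttempts = Infinity,
): Promise<number> {
  for (let attempt = 1; ; attempt++) {
    try {
      await connect();
      return attempt;
    } catch (err) {
      // Give up once the attempt budget is exhausted.
      if (attempt >= maxAttempts) throw err;
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```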

Rate Limiting

Message queue: Serialized (one at a time)
Quota detection: Automatic abort if limit reached
Model limits: Vary by provider/model
Free tier: May have rate limits
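
A serialized queue of this kind is commonly built by chaining promises. The sketch below shows the general technique; the `SerialQueue` class is hypothetical and is not taken from the gateway's source.

```typescript
// Hypothetical serial queue: tasks run one at a time, in enqueue order.
class SerialQueue {
  private tail: Promise<void> = Promise.resolve();

  // Runs `task` after every previously enqueued task has settled and
  // returns a promise for this task's own result.
  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Keep the chain alive even if this task rejects.
    this.tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  }
}
```

Because each task starts only after the previous one settles, a slow AI response delays later messages but never interleaves with them.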

Security & Privacy

Data Storage

Sessions: Stored locally in OpenCode database
Credentials: Stored in extension auth directory
No cloud: All data stays on your server
Encryption: WhatsApp encryption handled by Baileys

Access Control

Allowlist: Only specified phone numbers can message
Pairing: Unknown senders must enter verification code
Disabled: Can completely disable DM routing
Self-chat: Can use gateway without public access

Privacy

No logging: Messages not logged to stdout by default
Verbose mode: Only when explicitly enabled
No telemetry: No data sent anywhere
Local only: All processing on your machine

Development & Contribution

File Structure

extensions/whatsapp/
├── src/
│   ├── commands.ts        ← Add new commands here
│   ├── formatters.ts      ← Consistent response formatting
│   ├── constants.ts       ← Single source for values
│   ├── gateway.ts         ← Connection management
│   ├── inbound/
│   │   ├── monitor.ts     ← Message handling
│   │   ├── media.ts       ← Media extraction
│   │   ├── extract.ts     ← Text/location extraction
│   │   └── types.ts       ← WebInboundMessage type
│   └── ...
├── COMMANDS.md            ← Command documentation
├── README.md              ← This file
└── ARCHITECTURE.md        ← Deep dive into design

Adding Commands

Add to COMMANDS array in commands.ts
Add handler in whatsapp.ts handleMessage()
Add formatter in ResponseFormatter class
Help auto-generates from registry
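
To illustrate how help text can be generated from a registry, here is a small sketch. The `CommandHelp` shape and the `renderHelp` function are hypothetical; the actual registry fields and the ResponseFormatter output may differ.

```typescript
// Assumed minimal shape of a registry entry for help rendering.
interface CommandHelp {
  usage: string;
  description: string;
}

// Renders one aligned line per command, so adding a registry entry is
// enough to make it appear in the help output.
function renderHelp(commands: readonly CommandHelp[]): string {
  const width = Math.max(0, ...commands.map((c) => c.usage.length));
  return commands
    .map((c) => `${c.usage.padEnd(width)}  ${c.description}`)
    .join("\n");
}
```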

Extending Media Support

Add detector in monitor.ts enrichInboundMessage()
Add extraction in media.ts extractMediaFromMessage()
Update MEDIA_MAX_SIZE_MB in constants.ts

Version History

v1.0 (2026-03-29) ✨ Production Release

✅ Full media support (images, audio, documents, video)
✅ Channel awareness (bot knows it's WhatsApp)
✅ 10 commands (session, model, control)
✅ Session persistence
✅ Quota detection
✅ Per-session models
✅ Multi-session support
✅ Safe command handling (no injection)

Previous Versions

v0.9 — Refactored command system
v0.8 — Session management
v0.7 — Model selection
v0.6 — Initial release

Support & Documentation

Related files:

COMMANDS.md — Command reference (detailed)
ARCHITECTURE.md — Deep dive into design
opencode-whatsapp-complete-review.md — Complete system review

Get help:

/help in WhatsApp for command help
opencode-whatsapp config for setup help
GitHub issues for bugs
Documentation for implementation details

License & Credits

Part of: OpenCode AI Framework
Based on: OpenClaw WhatsApp extension
Technology: Baileys WebSocket library, Claude AI
Last Updated: 2026-03-29

For AI Bots

This README contains everything needed to help users with the WhatsApp gateway:

Understanding what commands do (see COMMANDS.md)
Troubleshooting connection issues
Recommending commands based on user intent
Explaining workflows and best practices

When a user asks "How do I..." or "Why can't I...", reference this documentation to provide accurate, helpful guidance.
