
claude-gemini-multimodal-bridge

v1.1.0


Enterprise-grade AI integration bridge connecting Claude Code, Gemini CLI, and Google AI Studio with intelligent routing and advanced multimodal processing capabilities


🌉 Claude-Gemini Multimodal Bridge

Unifying the Power of AI.

An MCP bridge that seamlessly integrates Claude Code, Gemini CLI, and Google AI Studio



🤔 Why CGMB?

🔄 Multi-Model Orchestration

Combines Claude's reasoning, Gemini CLI's web search, and AI Studio's media generation, in line with the 2026 trend toward specialized AI collaboration.

⚡ Zero Configuration

A single npm install completes the setup; the postinstall script automates the tedious steps.

🎯 MCP Standard Compliant

Follows Anthropic's Model Context Protocol (MCP). The project reports a 95% self-healing rate for errors.


✨ What's New in v1.1.0

| Feature | Description |
|---------|-------------|
| 🪟 Full Windows Support | Native support for both CLI and MCP |
| 📝 Enhanced OCR Processing | Automatic text extraction from scanned PDFs |
| 🚀 Latest Gemini Models | Support for gemini-2.5-flash, gemini-3-flash |
| 🔐 OAuth Authentication | File-based authentication compatible with Claude Code |
| 🌐 Auto Translation | Japanese to English translation for image generation |
| 📊 Smart Routing | PDF URLs to AI Studio, web pages to Gemini CLI |
| ⚡ Performance Optimization | Reduced timeouts, lazy loading, caching |
| 🛡️ Error Recovery | 95% self-healing with exponential backoff |


🏗️ Architecture

flowchart TD
    A[Claude Code] --> B[CGMB]

    B --> C[Gemini CLI]
    B --> D[Claude Code]
    B --> E[AI Studio]

| Layer | Specialization | Timeout |
|:-----:|:---------------|:-------:|
| 🔍 Gemini CLI | Web search, real-time information | 30s |
| 🧠 Claude Code | Complex reasoning, code analysis | 300s |
| 🎨 AI Studio | Image generation, audio synthesis, OCR | 120s |
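
The layer table above can be pictured as a small lookup structure. This is only a sketch: the names `Layer`, `LAYERS` and `hasTimedOut` are hypothetical and not taken from CGMB's source. Only the three layers, their specializations and their timeouts come from the table.

```typescript
// Hypothetical names; only the layers and timeout values come from the table.
type Layer = "gemini-cli" | "claude-code" | "ai-studio";

interface LayerSpec {
  specialization: string;
  timeoutMs: number;
}

const LAYERS: Record<Layer, LayerSpec> = {
  "gemini-cli": { specialization: "Web search, real-time information", timeoutMs: 30 * 1000 },
  "claude-code": { specialization: "Complex reasoning, code analysis", timeoutMs: 300 * 1000 },
  "ai-studio": { specialization: "Image generation, audio synthesis, OCR", timeoutMs: 120 * 1000 },
};

// Returns true once a call on the given layer has used up its time budget.
function hasTimedOut(layer: Layer, elapsedMs: number): boolean {
  return elapsedMs >= LAYERS[layer].timeoutMs;
}
```

For example, `hasTimedOut("gemini-cli", 45000)` is true, while the same elapsed time on `"ai-studio"` is still within budget.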


🚀 Quick Start

📋 Prerequisites

  • Node.js ≥ 22.0.0
  • Claude Code CLI installed
  • Gemini CLI (auto-installed)

📦 Installation

npm install -g claude-gemini-multimodal-bridge

💡 The postinstall script automatically:

  • Installs Gemini CLI
  • Sets up Claude Code MCP integration
  • Creates .env template
  • Verifies system requirements

🔑 Environment Setup

Create a .env file in your working directory:

AI_STUDIO_API_KEY=your_api_key_here

🔗 Get API key: https://aistudio.google.com/app/apikey

🎯 Gemini CLI Authentication

gemini

💬 Get Started with Claude Code

I installed CGMB via NPM. Please check my current environment for the cgmb command and help me use it.

💡 Usage Examples

CGMB integrates seamlessly with Claude Code. Just use the "CGMB" keyword:

# 🎨 Image generation
"CGMB generate an image of a futuristic city"

# 📄 Document analysis (use absolute paths)
"CGMB analyze the document at /full/path/to/report.pdf"

# 🌐 URL analysis
"CGMB analyze https://example.com/document.pdf"

# 🔍 Web search
"CGMB search for the latest AI news"

# 🎵 Audio generation
"CGMB create audio saying 'Welcome to our podcast'"

# 📝 OCR-enabled PDF analysis
"CGMB analyze this scanned PDF document with OCR"

🔄 Automatic Routing

  1. Include "CGMB" in your Claude Code request
  2. CGMB automatically routes to the optimal AI layer:
    • 🔍 Gemini CLI: Web search, latest information
    • 🎨 AI Studio: Images, audio, file processing
    • 🧠 Claude Code: Complex reasoning, code analysis
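
To make the routing steps above concrete, here is a minimal keyword-based sketch. It is an illustration only: the function name `routeRequest` and the keyword lists are assumptions, and CGMB's real routing logic may work quite differently. The URL rule follows the "Smart Routing" entry in the feature table (PDF URLs to AI Studio, web pages to Gemini CLI).

```typescript
type Target = "gemini-cli" | "ai-studio" | "claude-code";

// Hypothetical router: returns null when the request does not mention CGMB.
function routeRequest(request: string): Target | null {
  // Step 1: only requests containing the "CGMB" keyword are handled.
  if (!/\bcgmb\b/i.test(request)) return null;
  const text = request.toLowerCase();

  // URLs: PDF links go to AI Studio, other web pages to Gemini CLI.
  const url = text.match(/https?:\/\/\S+/);
  if (url) return /\.pdf(\?|#|$)/.test(url[0]) ? "ai-studio" : "gemini-cli";

  // Assumed keyword lists, loosely based on the usage examples.
  if (/\b(image|audio|ocr|pdf|document)\b/.test(text)) return "ai-studio";
  if (/\b(search|latest|news)\b/.test(text)) return "gemini-cli";

  // Everything else falls through to complex reasoning and code analysis.
  return "claude-code";
}
```

With these rules, "CGMB search for the latest AI news" goes to Gemini CLI and "CGMB generate an image of a futuristic city" goes to AI Studio.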

🤖 Models Used

| Purpose | Model ID | Layer |
|:-------:|:---------|:-----:|
| 🔍 Web Search | gemini-3-flash | Gemini CLI |
| 🎨 Image Generation | gemini-2.5-flash-image | AI Studio |
| 🎵 Audio Generation | gemini-2.5-flash-preview-tts | AI Studio |
| 📄 Document Processing | gemini-2.5-flash | AI Studio |
| 📝 OCR/Text Extraction | gemini-2.5-flash | AI Studio |
| 🔮 General Multimodal | gemini-2.0-flash-exp | AI Studio |
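
If you need the same mapping in your own scripts, the table can be transcribed as data. The sketch below is not CGMB's internal representation; `MODELS`, `modelFor` and `modelsOnLayer` are hypothetical names, and the purpose keys are made up for illustration. The model IDs are copied from the table.

```typescript
interface ModelEntry {
  purpose: string; // hypothetical key, one per table row
  modelId: string;
  layer: "Gemini CLI" | "AI Studio";
}

const MODELS: ModelEntry[] = [
  { purpose: "web-search", modelId: "gemini-3-flash", layer: "Gemini CLI" },
  { purpose: "image-generation", modelId: "gemini-2.5-flash-image", layer: "AI Studio" },
  { purpose: "audio-generation", modelId: "gemini-2.5-flash-preview-tts", layer: "AI Studio" },
  { purpose: "document-processing", modelId: "gemini-2.5-flash", layer: "AI Studio" },
  { purpose: "ocr", modelId: "gemini-2.5-flash", layer: "AI Studio" },
  { purpose: "general-multimodal", modelId: "gemini-2.0-flash-exp", layer: "AI Studio" },
];

// Looks up the model ID for a purpose, or undefined if the purpose is unknown.
function modelFor(purpose: string): string | undefined {
  for (const entry of MODELS) {
    if (entry.purpose === purpose) return entry.modelId;
  }
  return undefined;
}

// Lists the distinct model IDs used on one layer, in table order.
function modelsOnLayer(layer: ModelEntry["layer"]): string[] {
  const ids: string[] = [];
  for (const entry of MODELS) {
    if (entry.layer === layer && ids.indexOf(entry.modelId) === -1) ids.push(entry.modelId);
  }
  return ids;
}
```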


📈 Performance

80%

Authentication Overhead Reduction

60-80%

Search Cache Hit Rate

95%

Automatic Error Recovery Rate
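
The error recovery figure is attributed to retries with exponential backoff. As a rough illustration of that technique (not CGMB's actual implementation), the sketch below computes a capped delay schedule and retries an operation against it. All names and default values (`backoffDelays`, `retryWithBackoff`, 500 ms base, factor 2, 30 s cap) are assumptions. The retry helper is synchronous and takes the wait function as a parameter purely to keep the example small; a real bridge would presumably await asynchronous calls.

```typescript
// Delay before each retry: baseMs, baseMs*factor, baseMs*factor^2, ... capped at maxMs.
function backoffDelays(retries: number, baseMs = 500, factor = 2, maxMs = 30000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(Math.min(maxMs, baseMs * Math.pow(factor, attempt)));
  }
  return delays;
}

// Runs op once, then retries up to `retries` times, waiting longer after each failure.
// Rethrows the last error when all retries are used up.
function retryWithBackoff<T>(op: () => T, wait: (ms: number) => void, retries = 3): T {
  const delays = backoffDelays(retries);
  let attempt = 0;
  while (true) {
    try {
      return op();
    } catch (err) {
      const delay = delays[attempt++];
      if (delay === undefined) throw err; // no retries left
      wait(delay);
    }
  }
}
```

With the defaults, three retries wait 500 ms, 1000 ms and 2000 ms. The cap keeps long retry chains from waiting indefinitely.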


📄 PDF Processing & OCR

✨ OCR Features

  • ✅ Supports both text-based and scanned PDFs
  • ✅ Automatic OCR detection
  • ✅ Native OCR processing via Gemini File API
  • ✅ Multi-language support

📋 Processing Workflow

PDF Input → Upload → OCR Processing → Content Analysis → Output Results

📁 Supported Formats

  • Text-based PDFs
  • Scanned PDFs (OCR processing)
  • Image-based PDFs (OCR conversion)
  • Mixed content
  • Complex layouts (tables, charts, formatted content)

📂 File Organization

Generated content is automatically organized:

output/
├── images/     # 🎨 Generated images
├── audio/      # 🎵 Generated audio files
└── documents/  # 📄 Processed documents

Access via Claude Code:

  • get_generated_file: Retrieve specific files
  • list_generated_files: List all generated files
  • get_file_info: Get file metadata
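
The directory layout above implies a simple mapping from content type to output folder. The sketch below shows one way to express it; `GeneratedKind` and `outputPathFor` are hypothetical names, and stripping directory components from the file name is an added safety assumption, not documented CGMB behavior.

```typescript
type GeneratedKind = "image" | "audio" | "document";

// Folder names are taken from the output/ tree above.
const OUTPUT_DIRS: Record<GeneratedKind, string> = {
  image: "output/images",
  audio: "output/audio",
  document: "output/documents",
};

// Builds the relative path for a generated file, keeping only the base name
// so that a name like "../x.png" cannot escape the output folder.
function outputPathFor(kind: GeneratedKind, fileName: string): string {
  const baseName = fileName.split(/[\\/]/).pop() ?? fileName;
  return `${OUTPUT_DIRS[kind]}/${baseName}`;
}
```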

🔧 Configuration

Environment Variables

# Required
AI_STUDIO_API_KEY=your_api_key_here

# Optional
GEMINI_API_KEY=your_api_key_here
ENABLE_CACHING=true
CACHE_TTL=3600
LOG_LEVEL=info
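
A loader for these variables might look like the sketch below. It is a guess at the shape, not CGMB's code: `CgmbConfig` and `loadConfig` are hypothetical names, and treating the sample values above (`true`, `3600`, `info`) as defaults is an assumption. The README does not state the unit of `CACHE_TTL`, so the sketch stores the number as-is. The environment is passed in as a plain object so the function stays easy to test.

```typescript
interface CgmbConfig {
  aiStudioApiKey: string;
  geminiApiKey: string | undefined;
  enableCaching: boolean;
  cacheTtl: number; // unit not stated in the README
  logLevel: string;
}

// Reads the documented variables from an env-like object (e.g. process.env in Node.js).
function loadConfig(env: Record<string, string | undefined>): CgmbConfig {
  const aiStudioApiKey = env["AI_STUDIO_API_KEY"];
  if (!aiStudioApiKey) {
    throw new Error("AI_STUDIO_API_KEY is required");
  }
  const ttl = Number(env["CACHE_TTL"] ?? "3600");
  return {
    aiStudioApiKey,
    geminiApiKey: env["GEMINI_API_KEY"],
    enableCaching: (env["ENABLE_CACHING"] ?? "true").toLowerCase() === "true",
    cacheTtl: isFinite(ttl) && ttl > 0 ? ttl : 3600, // fall back on invalid input
    logLevel: env["LOG_LEVEL"] ?? "info",
  };
}
```

Failing early on a missing `AI_STUDIO_API_KEY` surfaces the only required setting before any layer is called.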

MCP Integration

CGMB automatically configures Claude Code MCP integration:

  • 📍 Config path: ~/.claude-code/mcp_servers.json
  • ⚡ Direct Node.js execution
  • 🔒 Safe merge without overwriting existing servers
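
A "safe merge" of this kind can be sketched as a pure function over the parsed config. The shape of `mcp_servers.json` is not shown in this README, so the `McpServerEntry` fields, the server name `cgmb` and the `server.js` argument below are all assumptions. The sketch also assumes that an existing entry with the same name is left untouched.

```typescript
// Assumed entry shape; the real file format may differ.
interface McpServerEntry {
  command: string;
  args: string[];
}

type McpServers = Record<string, McpServerEntry>;

// Returns a config that contains `entry` under `name` unless that name is
// already taken. Existing servers are never modified or removed.
function mergeServer(existing: McpServers, name: string, entry: McpServerEntry): McpServers {
  if (Object.prototype.hasOwnProperty.call(existing, name)) {
    return existing;
  }
  return { ...existing, [name]: entry };
}
```

Calling `mergeServer(current, "cgmb", { command: "node", args: ["server.js"] })` adds the new entry next to whatever servers were already configured.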

🪟 Windows Environment

As of v1.1.0, CGMB fully supports Windows:

| Feature | Status |
|---------|:------:|
| CLI | ✅ All commands work |
| MCP Integration | ✅ MCP tool calls work correctly |
| Path Resolution | ✅ Automatically handles C:\path\to\file format |
| Gemini CLI | ✅ Full compatibility with Windows version |

# Absolute paths recommended
cgmb analyze "C:\Users\name\Documents\report.pdf"

# Set environment variable (PowerShell)
$env:AI_STUDIO_API_KEY = "your_api_key_here"

# Set environment variable (Command Prompt)
set AI_STUDIO_API_KEY=your_api_key_here

🐧 Linux / WSL Environment

CGMB works fully on Linux and WSL:

| Feature | Status |
|---------|:------:|
| CLI | ✅ All commands work |
| MCP Integration | ✅ MCP tool calls work correctly |
| Path Resolution | ✅ Supports /mnt/ WSL paths and Unix paths |
| Gemini CLI | ✅ Full compatibility with Linux version |

# Use Unix path format
cgmb analyze /home/user/documents/report.pdf

# WSL environment example
cgmb analyze /mnt/c/Users/name/Documents/report.pdf

# Set environment variables
export AI_STUDIO_API_KEY="your_api_key_here"
export CGMB_CHAT_MODEL="gemini-2.5-flash"
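
The Windows and WSL examples point at the same file through two path formats. If you need to translate between them in your own tooling, a conversion could look like the sketch below. `windowsToWslPath` is a hypothetical helper, not part of CGMB, and it assumes the default WSL mount point `/mnt/<drive>`.

```typescript
// Converts "C:\Users\name\file.pdf" to "/mnt/c/Users/name/file.pdf".
// Paths that do not start with a drive letter are returned unchanged.
function windowsToWslPath(path: string): string {
  const match = /^([A-Za-z]):[\\/](.*)$/.exec(path);
  if (!match) return path;
  const [, drive = "", rest = ""] = match;
  return `/mnt/${drive.toLowerCase()}/${rest.replace(/\\/g, "/")}`;
}
```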

🔍 Troubleshooting

Debug Mode

export CGMB_DEBUG=true
export LOG_LEVEL=debug
cgmb serve --debug

OCR and PDF Processing Issues

If OCR results are inaccurate:

  • Use high-resolution scanned PDFs (300+ DPI)
  • Ensure clear, high-contrast text
  • Avoid skewed or rotated documents

If large documents time out:

  • Split large PDFs before processing (limit: 50MB, 1,000 pages)
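  • Extend the timeout: export AI_STUDIO_TIMEOUT=180000 (the value appears to be in milliseconds, i.e. 180 s instead of the 120 s listed for AI Studio)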
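
Before uploading, you can check a PDF against the stated limits and plan how to split it. The sketch below is only an illustration: `needsSplitting` and `chunkPageRanges` are hypothetical helpers, and reading "50MB" as 50 × 1024 × 1024 bytes is an assumption. It plans page ranges only; the actual splitting would need a PDF library.

```typescript
const MAX_BYTES = 50 * 1024 * 1024; // "50MB", assumed to mean binary megabytes
const MAX_PAGES = 1000;

// True when a PDF exceeds either documented limit.
function needsSplitting(sizeBytes: number, pageCount: number): boolean {
  return sizeBytes > MAX_BYTES || pageCount > MAX_PAGES;
}

// Splits pages 1..pageCount into inclusive [first, last] ranges of at most maxPages each.
function chunkPageRanges(pageCount: number, maxPages: number = MAX_PAGES): Array<[number, number]> {
  const ranges: Array<[number, number]> = [];
  for (let first = 1; first <= pageCount; first += maxPages) {
    ranges.push([first, Math.min(first + maxPages - 1, pageCount)]);
  }
  return ranges;
}
```

For a 2,500-page document, `chunkPageRanges(2500)` yields the ranges 1 to 1000, 1001 to 2000, and 2001 to 2500.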

💰 API Costs

CGMB uses pay-per-use APIs.


📁 Project Structure

src/
├── core/           # 🎯 Main MCP server and layer management
├── layers/         # 🔌 AI layer implementations
├── auth/           # 🔐 Authentication system
├── tools/          # 🛠️ Processing tools
├── workflows/      # 📋 Workflow implementations
├── utils/          # 🔧 Utilities and helpers
└── mcp-servers/    # 🌐 Custom MCP servers

🔗 Links

📦 Project

🔧 Related Tools

📜 Terms of Service


📜 Version History

v1.1.0 (2026-01-10)

  • 🪟 Full Windows Support: Native Windows support for both CLI and MCP
  • 📝 Enhanced OCR: Automatic OCR processing for image-based PDFs
  • 🚀 Latest Gemini Models: Support for gemini-2.5-flash, gemini-3-flash
  • Improved MCP Integration: Optimized async layer initialization
  • 📈 Performance Improvements: Reduced timeouts, lazy loading, enhanced caching
  • 🛡️ Error Recovery: 95% self-healing rate with exponential backoff

v1.0.4

  • 🎉 Initial release
  • 🏗️ 3-layer architecture implementation
  • 🎨 Basic multimodal processing

📄 License

MIT - See LICENSE


Made with ❤️ by goodaymmm

⭐ If this project helped you, please give it a star!
