fal-mcp

v1.2.0

Published

19 days ago

MCP server for Fal AI image tools and Chatterbox TTS


danielrosehill

mcp fal fal-ai image-generation ai claude anthropic tts text-to-speech chatterbox

fal-mcp

MCP server for Fal AI image tools and local TTS with configurable models.

Features

Built-in models: Nano Banana Pro and Flux 2 ready to use
Image utilities: Background removal and upscaling tools
Text-to-Speech: Chatterbox TTS integration for local speech synthesis
Custom models: Add your own favorites via environment variable
Flexible auth: FAL_KEY is required only for the Fal image tools (TTS works without it)

Installation

Claude Code

# Full installation with Fal AI + Chatterbox TTS
claude mcp add fal-mcp \
  -e FAL_KEY=your_api_key \
  -e CHATTERBOX_URL=http://localhost:8880 \
  -- npx -y fal-mcp

# Fal AI only (image tools)
claude mcp add fal-mcp -e FAL_KEY=your_api_key -- npx -y fal-mcp

# Chatterbox TTS only (no Fal key needed)
claude mcp add fal-mcp -e CHATTERBOX_URL=http://localhost:8880 -- npx -y fal-mcp

# With custom Fal models
claude mcp add fal-mcp \
  -e FAL_KEY=your_api_key \
  -e FAL_MODELS=fal-ai/flux-pro/v1.1,fal-ai/ideogram/v3 \
  -- npx -y fal-mcp

Verify Installation

claude mcp list

Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| FAL_KEY | For image tools | Your Fal AI API key from fal.ai |
| FAL_MODELS | No | Comma-separated list of additional Fal model IDs |
| CHATTERBOX_URL | No | Chatterbox TTS server URL (default: http://localhost:8880) |
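
As a rough illustration of how these three variables could be interpreted, here is a minimal sketch. `ServerConfig` and `readConfig` are hypothetical names, not part of fal-mcp, and the server's real parsing may differ. The only facts taken from this README are the variable names, the comma-separated format of FAL_MODELS, and the default Chatterbox URL.

```typescript
// Hypothetical config shape; the actual server internals are not documented here.
interface ServerConfig {
  falKey: string | undefined; // undefined means the Fal image tools cannot be used
  customModels: string[];     // extra Fal model IDs parsed from FAL_MODELS
  chatterboxUrl: string;      // falls back to the documented default
}

const DEFAULT_CHATTERBOX_URL = "http://localhost:8880";

// Takes a plain key/value record so the function stays easy to test.
// In Node you would typically pass the process environment.
function readConfig(env: Record<string, string | undefined>): ServerConfig {
  const rawKey = (env["FAL_KEY"] ?? "").trim();
  const rawUrl = (env["CHATTERBOX_URL"] ?? "").trim();
  const customModels = (env["FAL_MODELS"] ?? "")
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0); // ignore empty entries such as a trailing comma
  return {
    falKey: rawKey.length > 0 ? rawKey : undefined,
    customModels,
    chatterboxUrl: rawUrl.length > 0 ? rawUrl : DEFAULT_CHATTERBOX_URL,
  };
}

const exampleConfig = readConfig({
  FAL_MODELS: "fal-ai/flux-pro/v1.1, fal-ai/ideogram/v3",
});
console.log(exampleConfig.customModels.length, exampleConfig.chatterboxUrl);
```

With no FAL_KEY set, this sketch leaves `falKey` undefined, which mirrors the TTS-only setup described under Installation.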

Built-in Models

| Alias | Model ID | Description |
|-------|----------|-------------|
| nano-banana-pro | fal-ai/nano-banana-pro | Fast, high-quality generation (default) |
| flux-2 | fal-ai/flux-2 | Flux 2 text-to-image |

Tools

`generate_image`

Generate an image from a text prompt. Requires FAL_KEY.

Parameters:

prompt (required): Text description of the image
model: Model alias or full Fal ID (default: nano-banana-pro)
aspect_ratio: 21:9, 16:9, 4:3, 3:2, 1:1, 2:3, 3:4, 9:16, 9:21 (default: 16:9)
image_size: Preset sizes (square, square_hd, portrait_4_3, etc.); prefer aspect_ratio for finer control
seed: Integer for reproducible results
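
The parameters above can be pictured as a typed argument object. The sketch below is illustrative only: `GenerateImageArgs` and `buildGenerateImageArgs` are hypothetical names, and the validation rules are assumptions based on the allowed values and defaults listed above, not the server's actual checks.

```typescript
// Aspect ratios copied from the parameter list above.
const ASPECT_RATIOS = [
  "21:9", "16:9", "4:3", "3:2", "1:1", "2:3", "3:4", "9:16", "9:21",
] as const;
type AspectRatio = (typeof ASPECT_RATIOS)[number];

// Hypothetical argument shape for a generate_image call.
interface GenerateImageArgs {
  prompt: string;
  model: string;
  aspect_ratio: AspectRatio;
  seed?: number;
}

function isAspectRatio(value: string): value is AspectRatio {
  return (ASPECT_RATIOS as readonly string[]).indexOf(value) !== -1;
}

// Applies the documented defaults and rejects values outside the documented sets.
function buildGenerateImageArgs(input: {
  prompt: string;
  model?: string;
  aspect_ratio?: string;
  seed?: number;
}): GenerateImageArgs {
  if (input.prompt.trim().length === 0) {
    throw new Error("prompt is required");
  }
  const ratio = input.aspect_ratio ?? "16:9";
  if (!isAspectRatio(ratio)) {
    throw new Error(`unsupported aspect_ratio: ${ratio}`);
  }
  const args: GenerateImageArgs = {
    prompt: input.prompt,
    model: input.model ?? "nano-banana-pro",
    aspect_ratio: ratio,
  };
  if (input.seed !== undefined) {
    if (Math.floor(input.seed) !== input.seed) {
      throw new Error("seed must be an integer");
    }
    args.seed = input.seed;
  }
  return args;
}

console.log(buildGenerateImageArgs({ prompt: "a cat", aspect_ratio: "1:1" }));
```

In practice the MCP client builds these arguments from a natural-language request, so the sketch mainly shows which values are valid and which defaults apply.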

Examples:

"Generate an image of a sunset over mountains"
"Generate a 1:1 square image of a cat"
"Generate a 9:16 portrait using flux-2"
"Generate with aspect ratio 21:9: a cinematic landscape"

`remove_background`

Remove the background from an image using BiRefNet. Requires FAL_KEY.

Parameters:

image_url (required): URL of the image to process
model: General Use (Light), General Use (Heavy), or Portrait (default: General Use (Light))
operating_resolution: 1024x1024 or 2048x2048 (default: 1024x1024)
output_format: png or webp (default: png)
refine_foreground: Whether to refine edges (default: true)

Examples:

"Remove background from https://example.com/photo.jpg"
"Remove background with Portrait model from this headshot URL"

`upscale_image`

Upscale an image using Clarity Upscaler. Requires FAL_KEY.

Parameters:

image_url (required): URL of the image to upscale
upscale_factor: 1-4 (default: 2)
prompt: Optional guidance prompt (default: "masterpiece, best quality, highres")
creativity: How much the model can deviate, 0-1 (default: 0.35)
resemblance: How much to preserve original, 0-1 (default: 0.6)
seed: Integer for reproducible results
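
To make the numeric ranges and defaults concrete, here is a small sketch that fills in the documented defaults and checks each value against its documented range. `buildUpscaleArgs` and `assertInRange` are hypothetical helpers for illustration; whether the server rejects or clamps out-of-range values is not stated in this README, so rejecting is an assumption.

```typescript
// Hypothetical argument shape for an upscale_image call (seed omitted for brevity).
interface UpscaleArgs {
  image_url: string;
  upscale_factor: number;
  prompt: string;
  creativity: number;
  resemblance: number;
}

// Throws when the value lies outside [min, max]; NaN also fails the check.
function assertInRange(name: string, value: number, min: number, max: number): void {
  if (!(value >= min && value <= max)) {
    throw new RangeError(`${name} must be between ${min} and ${max}, got ${value}`);
  }
}

function buildUpscaleArgs(
  image_url: string,
  overrides: {
    upscale_factor?: number;
    prompt?: string;
    creativity?: number;
    resemblance?: number;
  } = {},
): UpscaleArgs {
  // Defaults are the ones listed in the parameter list above.
  const args: UpscaleArgs = {
    image_url,
    upscale_factor: overrides.upscale_factor ?? 2,
    prompt: overrides.prompt ?? "masterpiece, best quality, highres",
    creativity: overrides.creativity ?? 0.35,
    resemblance: overrides.resemblance ?? 0.6,
  };
  assertInRange("upscale_factor", args.upscale_factor, 1, 4);
  assertInRange("creativity", args.creativity, 0, 1);
  assertInRange("resemblance", args.resemblance, 0, 1);
  return args;
}

console.log(buildUpscaleArgs("https://example.com/small.jpg", { upscale_factor: 4 }));
```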

Examples:

"Upscale https://example.com/small.jpg by 4x"
"Upscale this image with high resemblance (0.9)"

`generate_speech`

Generate speech audio from text using Chatterbox TTS. Requires a running Chatterbox server reachable at CHATTERBOX_URL.

Parameters:

text (required): The text to convert to speech
voice: Voice to use (default: alloy)
output_format: mp3, wav, opus, or flac (default: mp3)
speed: Speech speed multiplier 0.25-4.0 (default: 1.0)
exaggeration: Emotion intensity 0.25-2.0 (default: 0.5)
cfg_weight: Pace/adherence control 0.0-1.0 (default: 0.5)
language: Language code (en, es, fr, de, ja, ko, etc.)

Examples:

"Generate speech: Hello, welcome to our podcast"
"Say 'Bonjour!' in French with high exaggeration"
"Convert this paragraph to speech and save as wav"

Supported Languages: Arabic, Danish, German, Greek, English, Spanish, Finnish, French, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Portuguese, Russian, Swedish, Swahili, Turkish
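
The `language` parameter takes a code rather than a name. The sketch below maps the language names listed above to their ISO 639-1 codes. The helper `languageCode` is hypothetical, and it is an assumption that the Chatterbox server accepts exactly these two-letter codes for every listed language; only en, es, fr, de, ja, and ko are shown explicitly in the parameter list.

```typescript
// ISO 639-1 codes for the languages named in the list above.
const LANGUAGE_CODES: Record<string, string> = {
  arabic: "ar", danish: "da", german: "de", greek: "el", english: "en",
  spanish: "es", finnish: "fi", french: "fr", hebrew: "he", hindi: "hi",
  italian: "it", japanese: "ja", korean: "ko", malay: "ms", dutch: "nl",
  norwegian: "no", polish: "pl", portuguese: "pt", russian: "ru",
  swedish: "sv", swahili: "sw", turkish: "tr",
};

// Returns the code for a language name, or undefined when the name is not listed.
function languageCode(name: string): string | undefined {
  const key = name.trim().toLowerCase();
  // Own-property check avoids matching inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(LANGUAGE_CODES, key)) {
    return undefined;
  }
  return LANGUAGE_CODES[key];
}

console.log(languageCode("French"));
```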

`list_models`

List all available tools and models (built-in + custom).

Adding Custom Models

Set FAL_MODELS with comma-separated Fal model IDs:

FAL_MODELS=fal-ai/flux-pro/v1.1,fal-ai/ideogram/v3,fal-ai/recraft/v3

Custom models get auto-generated aliases from their IDs:

fal-ai/flux-pro/v1.1 → flux-pro-v1.1
fal-ai/ideogram/v3 → ideogram-v3

You can use either the alias or full model ID when generating.
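
The alias rule implied by the two examples above can be sketched as follows: drop the leading `fal-ai/` and join the remaining path segments with hyphens. This is inferred from those examples only; `aliasFor` and `resolveModel` are hypothetical names, and passing unknown names through unchanged as full Fal IDs is an assumption about how the server behaves.

```typescript
// Built-in aliases from the Built-in Models table.
const BUILT_IN_MODELS: Record<string, string> = {
  "nano-banana-pro": "fal-ai/nano-banana-pro",
  "flux-2": "fal-ai/flux-2",
};

// Derives an alias such as "flux-pro-v1.1" from "fal-ai/flux-pro/v1.1".
function aliasFor(modelId: string): string {
  return modelId.replace(/^fal-ai\//, "").split("/").join("-");
}

// Resolves an alias or full ID to a full Fal model ID.
function resolveModel(name: string, customIds: string[]): string {
  if (Object.prototype.hasOwnProperty.call(BUILT_IN_MODELS, name)) {
    return BUILT_IN_MODELS[name] as string;
  }
  for (const id of customIds) {
    if (id === name || aliasFor(id) === name) {
      return id;
    }
  }
  // Assumption: anything else is treated as a full Fal model ID.
  return name;
}

const customIds = ["fal-ai/flux-pro/v1.1", "fal-ai/ideogram/v3"];
console.log(aliasFor("fal-ai/flux-pro/v1.1"));       // flux-pro-v1.1
console.log(resolveModel("ideogram-v3", customIds)); // fal-ai/ideogram/v3
```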

Chatterbox TTS Setup

Chatterbox TTS is a local text-to-speech server with voice cloning capabilities.

Running Chatterbox with Docker

docker run -d --name chatterbox-tts \
  -p 8880:8880 \
  --gpus all \
  travisvn/chatterbox-tts-api:latest

For AMD GPUs (ROCm)

docker run -d --name chatterbox-tts \
  -p 8880:8880 \
  --device=/dev/kfd --device=/dev/dri \
  -e HSA_OVERRIDE_GFX_VERSION=11.0.1 \
  travisvn/chatterbox-tts-api:rocm

Development

# Clone and install
git clone https://github.com/danielrosehill/fal-mcp.git
cd fal-mcp
npm install

# Run in dev mode
FAL_KEY=your_key npm run dev

# Run with Chatterbox only (no Fal key)
CHATTERBOX_URL=http://localhost:8880 npm run dev

# Build for production
npm run build

License

MIT
