@mcptoolshop/voice-soundboard-core

v0.2.3, published 2 days ago

Backend-agnostic core library for MCP Voice Soundboard — 48 voices, 9 languages, validation, SSML-lite, chunking, emotion spans, SFX tags, and schemas.


mikefrilot

mcp voice tts text-to-speech soundboard ssml speech-synthesis ai-agent model-context-protocol

Highlights

MCP native — stdio transport, works with Claude Desktop, Cursor, and any MCP client
5 tools — voice_speak, voice_dialogue, voice_status, voice_interrupt, voice_inner_monologue
48 approved voices, 9 languages — English (American + British), Japanese, Mandarin, Spanish, French, Hindi, Italian, Brazilian Portuguese. Curated presets: narrator, announcer, whisper, storyteller, assistant
Emotion spans — 8 emotions via [happy]...[/happy] inline markup
SSML-lite — <break>, <emphasis>, <prosody> without full SSML complexity
SFX tags — [ding], [chime], [whoosh], [tada], [error], [click] inline sound effects
Multi-speaker dialogue — Speaker: line format with auto-cast and pause directives
Guardrails — rate limiting, concurrency semaphore, request timeouts, path traversal protection, secret redaction
Swappable backends — Mock (built-in), HTTP proxy, Python bridge, or bring your own
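
As a rough illustration of the inline SFX tags listed above, the sketch below splits a string into plain-text and sound-effect segments. The tag names come from the list; `splitSfx` and `SfxSegment` are hypothetical names used only for this example and are not the package's actual parser.

```typescript
// Hypothetical sketch: only the six tag names are taken from this README.
const SFX_TAGS = ["ding", "chime", "whoosh", "tada", "error", "click"];

interface SfxSegment {
  kind: "text" | "sfx";
  value: string;
}

// Split input into alternating text and SFX segments.
// Bracketed words that are not known SFX tags are left inside the text.
function splitSfx(input: string): SfxSegment[] {
  const pattern = new RegExp("\\[(" + SFX_TAGS.join("|") + ")\\]", "g");
  const segments: SfxSegment[] = [];
  let last = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(input)) !== null) {
    if (match.index > last) {
      segments.push({ kind: "text", value: input.slice(last, match.index) });
    }
    segments.push({ kind: "sfx", value: match[1] ?? "" });
    last = match.index + match[0].length;
  }
  if (last < input.length) {
    segments.push({ kind: "text", value: input.slice(last) });
  }
  return segments;
}
```

For example, `splitSfx("Done [ding] all good")` yields a text segment, an `sfx` segment for `ding`, and a trailing text segment.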

Quick Start

npx @mcptoolshop/voice-soundboard-mcp

Or install globally:

npm install -g @mcptoolshop/voice-soundboard-mcp
voice-soundboard-mcp

Claude Desktop / MCP Client Config

Add to your MCP client configuration (e.g. claude_desktop_config.json):

{
  "mcpServers": {
    "voice-soundboard": {
      "command": "npx",
      "args": ["-y", "@mcptoolshop/voice-soundboard-mcp"]
    }
  }
}

With options:

{
  "mcpServers": {
    "voice-soundboard": {
      "command": "npx",
      "args": [
        "-y", "@mcptoolshop/voice-soundboard-mcp",
        "--artifact=path",
        "--output-dir=/tmp/voice-output",
        "--timeout=30000",
        "--max-concurrent=2"
      ]
    }
  }
}

MCP Tools

`voice_speak`

Synthesize speech from text.

text:         "Hello world!"
voice?:       "am_fenrir"          # Voice ID or preset name
speed?:       1.0                  # 0.5 - 2.0
format?:      "wav"                # wav | mp3 | ogg | raw
artifactMode?: "path"             # path | base64
sfx?:         true                # Enable [ding], [chime] etc.
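
To make the argument shape concrete, here is a minimal TypeScript sketch of the `voice_speak` arguments with a small pre-flight check. The field names, format values, and speed range are taken from the listing above; `VoiceSpeakArgs` and `validateSpeakArgs` are hypothetical names, and the empty-text rule is an assumption rather than documented behavior.

```typescript
// Hypothetical types: field names mirror the listing, nothing here is the package's API.
type AudioFormat = "wav" | "mp3" | "ogg" | "raw";
type ArtifactMode = "path" | "base64";

interface VoiceSpeakArgs {
  text: string;
  voice?: string; // voice ID or preset name
  speed?: number; // 0.5 - 2.0
  format?: AudioFormat;
  artifactMode?: ArtifactMode;
  sfx?: boolean;
}

// Return a list of problems; an empty list means the arguments look usable.
function validateSpeakArgs(args: VoiceSpeakArgs): string[] {
  const errors: string[] = [];
  if (args.text.trim().length === 0) {
    errors.push("text must not be empty"); // assumption, not stated in the README
  }
  // The negated range check also rejects NaN.
  if (args.speed !== undefined && !(args.speed >= 0.5 && args.speed <= 2.0)) {
    errors.push("speed must be between 0.5 and 2.0");
  }
  return errors;
}
```

Checking on the client side like this is optional; the server performs its own validation.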

`voice_dialogue`

Multi-speaker dialogue synthesis.

script:       "Alice: Hello!\nBob: Hey there!"
cast?:        { "Alice": "af_sky", "Bob": "am_fenrir" }
speed?:       1.0
concat?:      true                 # Combine into single file
debug?:       true                 # Include cue_sheet
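
The `Speaker: line` script format can be pictured with a small parser. This is a sketch only: `parseScript` and `DialogueTurn` are hypothetical names, the real handler also supports auto-casting and pause directives (not modeled here), and lines without a speaker prefix are simply skipped.

```typescript
// Hypothetical sketch of splitting a script into speaker turns.
interface DialogueTurn {
  speaker: string;
  text: string;
  voice: string | undefined; // undefined when the cast has no entry for the speaker
}

function parseScript(
  script: string,
  cast: { [speaker: string]: string } = {}
): DialogueTurn[] {
  const turns: DialogueTurn[] = [];
  for (const rawLine of script.split("\n")) {
    const line = rawLine.trim();
    const colon = line.indexOf(":");
    if (colon <= 0) continue; // blank line or no "Speaker:" prefix
    const speaker = line.slice(0, colon).trim();
    const text = line.slice(colon + 1).trim();
    const voice = Object.prototype.hasOwnProperty.call(cast, speaker)
      ? cast[speaker]
      : undefined;
    turns.push({ speaker, text, voice });
  }
  return turns;
}
```

With the example script and cast above, this produces two turns, one for Alice using `af_sky` and one for Bob using `am_fenrir`.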

`voice_status`

Returns engine health, available voices, presets, and backend info. No arguments.

`voice_interrupt`

Stop or roll back active synthesis.

streamId?:    "stream-123"
reason?:      "user_spoke"         # user_spoke | context_change | timeout | manual

`voice_inner_monologue`

Ephemeral micro-utterances for ambient narration. Requires --ambient flag or VOICE_SOUNDBOARD_AMBIENT_ENABLED=1.

text:         "Interesting..."     # Max 500 chars, auto-redacted
category?:    "thinking"           # general | thinking | observation | debug

Voices

48 voices across 9 languages. Language is auto-inferred from the voice ID prefix — no configuration required.

| Prefix | Language |
|--------|----------|
| af_ / am_ | English (American) |
| bf_ / bm_ | English (British) |
| jf_ / jm_ | Japanese |
| zf_ / zm_ | Mandarin Chinese |
| ef_ / em_ | Spanish |
| ff_ | French |
| hf_ / hm_ | Hindi |
| if_ / im_ | Italian |
| pf_ / pm_ | Brazilian Portuguese |

See the full voice list for all 48 voices with names and styles.
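
Prefix-based language inference can be sketched as a simple lookup. The prefixes and language names are taken from the prefix table in this section; `inferLanguage` and `LANGUAGE_BY_PREFIX` are hypothetical names, and the voice ID `ff_example` used in the note below is made up for illustration.

```typescript
// Hypothetical lookup built from the prefix table in this README.
const LANGUAGE_BY_PREFIX: { [prefix: string]: string } = {
  af: "English (American)",
  am: "English (American)",
  bf: "English (British)",
  bm: "English (British)",
  jf: "Japanese",
  jm: "Japanese",
  zf: "Mandarin Chinese",
  zm: "Mandarin Chinese",
  ef: "Spanish",
  em: "Spanish",
  ff: "French",
  hf: "Hindi",
  hm: "Hindi",
  "if": "Italian",
  im: "Italian",
  pf: "Brazilian Portuguese",
  pm: "Brazilian Portuguese",
};

// Return the language for a voice ID such as "am_fenrir", or undefined if the prefix is unknown.
function inferLanguage(voiceId: string): string | undefined {
  if (voiceId.indexOf("_") !== 2) return undefined;
  const prefix = voiceId.slice(0, 2);
  return Object.prototype.hasOwnProperty.call(LANGUAGE_BY_PREFIX, prefix)
    ? LANGUAGE_BY_PREFIX[prefix]
    : undefined;
}
```

Under these assumptions, `inferLanguage("bm_george")` resolves to British English and `inferLanguage("ff_example")` to French, while an unlisted prefix yields `undefined`.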

Presets

| Preset | Voice | Speed | Description |
|--------|-------|-------|-------------|
| narrator | bm_george | 0.95 | Calm documentary style |
| announcer | am_onyx | 1.05 | News anchor energy |
| whisper | af_aoede | 0.85 | Soft, intimate |
| storyteller | bf_emma | 0.90 | Warm bedtime-story feel |
| assistant | af_jessica | 1.0 | Neutral, helpful |
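
Since the `voice` argument accepts either a voice ID or a preset name, resolution might look roughly like the sketch below. The preset values are copied from the presets in this section; `resolveVoice` is a hypothetical helper, and both the "explicit speed overrides the preset speed" rule and the fallback speed of 1.0 are assumptions.

```typescript
// Hypothetical sketch: preset data is from this README, the resolution rules are assumed.
interface ResolvedVoice {
  voice: string;
  speed: number;
}

const PRESETS: { [name: string]: ResolvedVoice } = {
  narrator: { voice: "bm_george", speed: 0.95 },
  announcer: { voice: "am_onyx", speed: 1.05 },
  whisper: { voice: "af_aoede", speed: 0.85 },
  storyteller: { voice: "bf_emma", speed: 0.9 },
  assistant: { voice: "af_jessica", speed: 1.0 },
};

// Map a preset name to its voice and speed; pass plain voice IDs through unchanged.
function resolveVoice(voiceOrPreset: string, speed?: number): ResolvedVoice {
  const preset = Object.prototype.hasOwnProperty.call(PRESETS, voiceOrPreset)
    ? PRESETS[voiceOrPreset]
    : undefined;
  if (preset !== undefined) {
    return { voice: preset.voice, speed: speed ?? preset.speed };
  }
  return { voice: voiceOrPreset, speed: speed ?? 1.0 };
}
```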

Emotion Spans

Wrap text in emotion tags to control prosody:

[happy]Great news![/happy] But [sad]I have to go.[/sad]

Supported: happy, sad, angry, fearful, surprised, disgusted, calm, excited
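
The markup above can be tokenized with a small regular-expression parser. This is a sketch under stated assumptions, not the package's `emotion.ts`: `parseEmotionSpans` and `EmotionSpan` are hypothetical names, nested spans are not handled, and unknown or unclosed tags are left in the plain text.

```typescript
// Hypothetical sketch: only the eight emotion names come from this README.
const EMOTIONS = [
  "happy", "sad", "angry", "fearful",
  "surprised", "disgusted", "calm", "excited",
];

interface EmotionSpan {
  emotion: string | null; // null marks untagged text
  text: string;
}

function parseEmotionSpans(input: string): EmotionSpan[] {
  // \1 requires the closing tag to match the opening tag.
  const pattern = new RegExp(
    "\\[(" + EMOTIONS.join("|") + ")\\]([\\s\\S]*?)\\[/\\1\\]",
    "g"
  );
  const spans: EmotionSpan[] = [];
  let last = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(input)) !== null) {
    if (match.index > last) {
      spans.push({ emotion: null, text: input.slice(last, match.index) });
    }
    spans.push({ emotion: match[1] ?? null, text: match[2] ?? "" });
    last = match.index + match[0].length;
  }
  if (last < input.length) {
    spans.push({ emotion: null, text: input.slice(last) });
  }
  return spans;
}
```

Applied to the example line, this yields three spans: a `happy` span with "Great news!", an untagged " But ", and a `sad` span with "I have to go.".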

CLI Flags

| Flag | Default | Description |
|------|---------|-------------|
| `--artifact=path\|base64` | path | Audio delivery mode |
| `--output-dir=<path>` | `<tmpdir>/voice-soundboard/` | Output directory |
| `--backend=mock\|http` | mock | Backend selection |
| `--backend-url=<url>` | (none) | HTTP backend URL |
| `--ambient` | off | Enable inner-monologue system |
| `--max-concurrent=<n>` | 1 | Max concurrent synthesis requests |
| `--timeout=<ms>` | 20000 | Per-request timeout |
| `--retention-minutes=<n>` | 240 | Auto-cleanup age (0 to disable) |
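
To show how the flags and their defaults fit together, here is a hedged sketch of a flag parser. The flag names and default values are taken from the CLI flags listed in this section; `parseFlags`, `ServerOptions`, and `toNonNegativeInt` are hypothetical, and the real `cli.ts` may parse and validate differently. The output directory default is left unresolved here because it depends on the system temp directory.

```typescript
// Hypothetical sketch: flag names and defaults are from this README, the parser is not.
interface ServerOptions {
  artifact: "path" | "base64";
  outputDir: string | undefined; // undefined means "<tmpdir>/voice-soundboard/"
  backend: "mock" | "http";
  backendUrl: string | undefined;
  ambient: boolean;
  maxConcurrent: number;
  timeoutMs: number;
  retentionMinutes: number; // 0 disables auto-cleanup
}

// Parse a non-negative integer, keeping the fallback on bad input.
function toNonNegativeInt(value: string, fallback: number): number {
  const n = parseInt(value, 10);
  return isNaN(n) || n < 0 ? fallback : n;
}

function parseFlags(argv: string[]): ServerOptions {
  const opts: ServerOptions = {
    artifact: "path",
    outputDir: undefined,
    backend: "mock",
    backendUrl: undefined,
    ambient: false,
    maxConcurrent: 1,
    timeoutMs: 20000,
    retentionMinutes: 240,
  };
  for (const arg of argv) {
    if (arg === "--ambient") {
      opts.ambient = true;
      continue;
    }
    const eq = arg.indexOf("=");
    if (arg.slice(0, 2) !== "--" || eq === -1) continue; // ignore anything else
    const name = arg.slice(2, eq);
    const value = arg.slice(eq + 1);
    switch (name) {
      case "artifact":
        if (value === "path" || value === "base64") opts.artifact = value;
        break;
      case "output-dir":
        opts.outputDir = value;
        break;
      case "backend":
        if (value === "mock" || value === "http") opts.backend = value;
        break;
      case "backend-url":
        opts.backendUrl = value;
        break;
      case "max-concurrent":
        opts.maxConcurrent = toNonNegativeInt(value, opts.maxConcurrent);
        break;
      case "timeout":
        opts.timeoutMs = toNonNegativeInt(value, opts.timeoutMs);
        break;
      case "retention-minutes":
        opts.retentionMinutes = toNonNegativeInt(value, opts.retentionMinutes);
        break;
    }
  }
  return opts;
}
```

Passing the arguments from the "With options" config (`--artifact=path`, `--output-dir=/tmp/voice-output`, `--timeout=30000`, `--max-concurrent=2`) overrides the timeout and concurrency while leaving the backend at `mock` and retention at 240 minutes.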

Packages

This is a pnpm monorepo with two publishable packages:

| Package | Description |
|---------|-------------|
| @mcptoolshop/voice-soundboard-core | Backend-agnostic core library (validation, SSML, chunking, schemas) |
| @mcptoolshop/voice-soundboard-mcp | MCP server with CLI, guardrails, and transport |

Development

# Install
pnpm install

# Build
pnpm build

# Test (342 tests)
pnpm test

Part of MCP Tool Shop

Project Structure

mcp-voice-soundboard/
  packages/
    core/               @mcptoolshop/voice-soundboard-core
      src/
        limits.ts         SHIP_LIMITS, text/chunk limits
        schemas.ts        VoiceRequest, VoiceResponse, error codes
        artifact.ts       resolveOutputDir, path sandbox
        voices.ts         Approved voice registry + presets
        emotion.ts        Emotion span parser
        ssml/             SSML-lite parser + limits
        chunking/         Text chunker
        sfx/              SFX tag parser + registry
        sandbox.ts        Safe filenames, symlink checks
        ambient.ts        AmbientEmitter for inner monologue
        redact.ts         PII/secret redaction
    mcp-server/         @mcptoolshop/voice-soundboard-mcp
      src/
        server.ts         MCP tool registration + guardrail wiring
        cli.ts            CLI entrypoint (stdio transport)
        backend.ts        Backend abstraction + mock/HTTP
        concurrency.ts    SynthesisSemaphore
        rateLimit.ts      ToolRateLimiter (sliding window)
        timeout.ts        withTimeout utility
        retention.ts      Output file cleanup timer
        redact.ts         Server-level redaction
        validation.ts     Synthesis result validation
        tools/            Individual tool handlers
  assets/               Logo, audio event manifests
  docs/                 Architecture docs
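
The listing above mentions a sliding-window `ToolRateLimiter`, but this README does not show its implementation or its limits. The following is a generic sketch of the sliding-window technique only; the class name `SlidingWindowLimiter` and the `maxCalls` and `windowMs` parameters are hypothetical.

```typescript
// Generic sliding-window limiter sketch; not the package's rateLimit.ts.
class SlidingWindowLimiter {
  private readonly maxCalls: number;
  private readonly windowMs: number;
  private readonly hits = new Map<string, number[]>();

  constructor(maxCalls: number, windowMs: number) {
    this.maxCalls = maxCalls;
    this.windowMs = windowMs;
  }

  // Record a call for the given tool if it fits in the window; return whether it was allowed.
  // `now` is injectable so the behavior can be tested without real timers.
  tryAcquire(tool: string, now: number = Date.now()): boolean {
    const previous = this.hits.get(tool) ?? [];
    // Keep only timestamps that are still inside the window.
    const recent = previous.filter((t) => now - t < this.windowMs);
    const allowed = recent.length < this.maxCalls;
    if (allowed) recent.push(now);
    this.hits.set(tool, recent);
    return allowed;
  }
}
```

Each tool name gets its own window, so a burst of `voice_speak` calls would not consume the budget of `voice_status` in this sketch.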

Security

See SECURITY.md for vulnerability reporting.

See THREAT_MODEL.md for the full threat surface analysis.

| Project | Description |
|---------|-------------|
| soundboard-plugin | Claude Code plugin: slash commands, emotion-aware narration |

Support

Questions / help: Discussions
Bug reports: Issues
Security: SECURITY.md

License

MIT
