n8n-nodes-ttsbro

v0.1.6

Published

19 hours ago

Text-to-Speech n8n node using sherpa-onnx Kokoro model

Downloads

492

iamvaar

n8n-community-node-package n8n tts text-to-speech kokoro sherpa-onnx wasm ttsbro

TTS Bro - n8n Text-to-Speech Node

Text-to-Speech n8n node powered by sherpa-onnx and the Kokoro TTS model.

🎯 Pure JavaScript/WebAssembly - No native binary dependencies!
⚡ High Performance - Singleton TTS instance for fast subsequent calls
🔒 Offline - Runs completely locally, no API calls needed
🐳 Docker Ready - Works in containerized n8n environments
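
The singleton behavior mentioned above can be pictured as an instance that is created on first use and reused afterwards. The following is a minimal TypeScript sketch of that pattern, not the package's actual code: `TtsEngine`, `createEngine`, and `getEngine` are hypothetical names, and the stub stands in for the step where the sherpa-onnx WASM module and Kokoro model would be loaded.

```typescript
// Hypothetical stand-in for the TTS engine; the real one wraps sherpa-onnx.
interface TtsEngine {
  synthesize(text: string): string;
}

// Counts how many times the expensive setup ran (for illustration only).
let engineLoads = 0;

// Assumption: creating the engine is the slow step (WASM and model load).
function createEngine(): TtsEngine {
  engineLoads += 1;
  return {
    synthesize: (text: string): string => "audio for: " + text,
  };
}

let cachedEngine: TtsEngine | null = null;

// Create the engine on first use, then hand back the same instance.
function getEngine(): TtsEngine {
  if (cachedEngine === null) {
    cachedEngine = createEngine();
  }
  return cachedEngine;
}

const firstCall = getEngine();
const secondCall = getEngine();
console.log(firstCall === secondCall, engineLoads); // true 1
```

Only the first call pays the setup cost; later calls in the same process reuse the cached instance.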

Features

High-Quality Neural TTS using Kokoro-82M model
Multiple Voices - 10+ speaker voices available
Adjustable Speed - 0.5x to 2.0x speech speed control
Multiple Output Formats - WAV or Raw PCM
Binary Output - Audio data as n8n binary property
Works in Docker - Pure WASM, no native dependencies

Installation

Option 1: Docker (Recommended)

Use the provided Dockerfile and docker-compose.yml:

# Download Kokoro model (304MB)
npm run download-model

# Build and run
docker-compose up --build

Option 2: n8n Community Nodes

1. Go to Settings > Community Nodes
2. Enter n8n-nodes-ttsbro
3. Click Install

Option 3: Manual Installation

# Install the node (create the custom directory if it does not exist)
mkdir -p ~/.n8n/custom
cd ~/.n8n/custom
npm install n8n-nodes-ttsbro

# Download the model
cd node_modules/n8n-nodes-ttsbro
npm run download-model

# Restart n8n

Usage

1. Add the TTS Bro node to your workflow.
2. Configure:
   - Text: The text to convert to speech
   - Voice: Select a speaker voice (0-9)
   - Speed: Speech speed (default: 1.0)
   - Output Format: WAV or Raw PCM
   - Binary Property: Name for the output (default: "audio")
3. The output is a binary audio file that can be:
   - Saved to disk
   - Uploaded to cloud storage
   - Sent via messaging apps
   - Played in browsers

Node Properties

| Property | Type | Default | Description |
|----------|------|---------|-------------|
| Text | string | - | Text to synthesize (required) |
| Voice | options | Voice 0 | Speaker voice selection |
| Speed | number | 1.0 | Speech speed (0.5-2.0) |
| Format | options | WAV | Output format (WAV/Raw PCM) |
| Binary Property | string | audio | Output property name |
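
The defaults and ranges in the table can be applied before any synthesis call. The sketch below is illustrative only: `TtsParams` and `normalizeParams` are hypothetical names, `"pcm"` is an assumed identifier for the Raw PCM option, and the choice to clamp speed while rejecting an out-of-range voice is an assumption, since the node's own handling of invalid values is not documented here.

```typescript
// Hypothetical shape mirroring the node properties table.
interface TtsParams {
  text: string;
  voice: number;
  speed: number;
  format: "wav" | "pcm"; // "pcm" is an assumed identifier for Raw PCM
  binaryProperty: string;
}

// Ranges taken from the documentation: speed 0.5-2.0, voices 0-9.
const MIN_SPEED = 0.5;
const MAX_SPEED = 2.0;
const VOICE_COUNT = 10;

function normalizeParams(input: {
  text: string;
  voice?: number;
  speed?: number;
  format?: "wav" | "pcm";
  binaryProperty?: string;
}): TtsParams {
  if (input.text.trim().length === 0) {
    throw new Error("Text is required");
  }
  const voice = input.voice === undefined ? 0 : input.voice;
  if (Math.floor(voice) !== voice || voice < 0 || voice >= VOICE_COUNT) {
    throw new Error("Voice must be an integer from 0 to 9");
  }
  const rawSpeed = input.speed === undefined ? 1.0 : input.speed;
  if (isNaN(rawSpeed)) {
    throw new Error("Speed must be a number");
  }
  // Assumed policy: clamp speed into the documented range.
  const speed = Math.min(MAX_SPEED, Math.max(MIN_SPEED, rawSpeed));
  return {
    text: input.text,
    voice: voice,
    speed: speed,
    format: input.format === undefined ? "wav" : input.format,
    binaryProperty: input.binaryProperty === undefined ? "audio" : input.binaryProperty,
  };
}

console.log(normalizeParams({ text: "Hello world!", speed: 3 }).speed); // 2
```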

Output

{
  "json": {
    "text": "Hello world!",
    "voice": 0,
    "speed": 1.0,
    "format": "wav",
    "sampleRate": 24000,
    "duration": 1.23,
    "byteLength": 54382
  },
  "binary": {
    "audio": { ... }  // Binary audio data
  }
}
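
A downstream step that consumes this item may want to check the metadata before using it. The following TypeScript sketch derives a type and a type guard from the sample above; `TtsOutputJson` and `isTtsOutputJson` are hypothetical names, and treating `duration` as seconds is an assumption.

```typescript
// Field names and types taken from the sample output.
interface TtsOutputJson {
  text: string;
  voice: number;
  speed: number;
  format: string;
  sampleRate: number;
  duration: number; // assumed to be seconds
  byteLength: number;
}

// Returns true only when every expected field is present with the right type.
function isTtsOutputJson(value: unknown): value is TtsOutputJson {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as { [key: string]: unknown };
  const numberKeys = ["voice", "speed", "sampleRate", "duration", "byteLength"];
  for (const key of numberKeys) {
    if (typeof record[key] !== "number") {
      return false;
    }
  }
  return typeof record["text"] === "string" && typeof record["format"] === "string";
}

const sample: unknown = {
  text: "Hello world!",
  voice: 0,
  speed: 1.0,
  format: "wav",
  sampleRate: 24000,
  duration: 1.23,
  byteLength: 54382,
};
console.log(isTtsOutputJson(sample)); // true
```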

Technical Details

TTS Engine: sherpa-onnx via WebAssembly
Model: Kokoro-82M (English, multi-voice)
Sample Rate: 24000 Hz
Bit Depth: 16-bit
Channels: Mono
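
These parameters are enough to relate the two output formats: a WAV file is the raw PCM samples preceded by a 44-byte RIFF header. The sketch below builds that standard header for 24000 Hz, 16-bit, mono audio and derives a duration from the PCM byte count. It assumes the node's Raw PCM output is 16-bit little-endian; `pcmToWav` and `pcmDurationSeconds` are hypothetical helper names, not part of the package.

```typescript
// Audio parameters from the Technical Details list.
const SAMPLE_RATE = 24000;
const BITS_PER_SAMPLE = 16;
const CHANNELS = 1;

// Write an ASCII tag such as "RIFF" byte by byte.
function writeAscii(view: DataView, offset: number, text: string): void {
  for (let i = 0; i < text.length; i++) {
    view.setUint8(offset + i, text.charCodeAt(i));
  }
}

// Prepend the canonical 44-byte PCM WAV header to raw sample bytes.
function pcmToWav(pcm: Uint8Array): Uint8Array {
  const blockAlign = CHANNELS * (BITS_PER_SAMPLE / 8); // bytes per sample frame
  const byteRate = SAMPLE_RATE * blockAlign;           // bytes per second
  const wav = new Uint8Array(44 + pcm.length);
  const view = new DataView(wav.buffer);
  writeAscii(view, 0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true);  // file size minus 8 bytes
  writeAscii(view, 8, "WAVE");
  writeAscii(view, 12, "fmt ");
  view.setUint32(16, 16, true);              // size of the fmt chunk
  view.setUint16(20, 1, true);               // audio format 1 = linear PCM
  view.setUint16(22, CHANNELS, true);
  view.setUint32(24, SAMPLE_RATE, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, BITS_PER_SAMPLE, true);
  writeAscii(view, 36, "data");
  view.setUint32(40, pcm.length, true);      // size of the sample data
  wav.set(pcm, 44);
  return wav;
}

// Duration in seconds for a given number of raw PCM bytes.
function pcmDurationSeconds(pcmByteLength: number): number {
  return pcmByteLength / (SAMPLE_RATE * CHANNELS * (BITS_PER_SAMPLE / 8));
}

const oneSecond = new Uint8Array(48000); // one second of silence
console.log(pcmToWav(oneSecond).length, pcmDurationSeconds(oneSecond.length)); // 48044 1
```

At these settings one second of audio takes 48000 bytes of PCM, so the WAV form of the same clip is 48044 bytes.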

Model Info

The Kokoro model is:

82 million parameters - Compact yet high quality
Apache 2.0 licensed - Free for commercial use
Multi-voice - Multiple speaker styles
English focused - Optimized for English text

Requirements

Node.js >= 18
n8n >= 1.0.0
At least 304MB of disk space for the model download

Development

# Clone and install
git clone https://github.com/your-username/n8n-nodes-ttsbro.git
cd n8n-nodes-ttsbro
npm install

# Download model
npm run download-model

# Build
npm run build

# Run with n8n
npm run start

Legal & License

This project is licensed under the Apache License 2.0.

> [!IMPORTANT]
> This distribution includes components and models with different licenses. Please see the NOTICE file for full third-party attribution and license details.

Third-Party Components

| Component | License | Notes |
|-----------|---------|-------|
| sherpa-onnx | Apache 2.0 | TTS inference engine |
| Kokoro-82M | Apache 2.0 | TTS model weights |
| ONNX Runtime | MIT | Neural network inference runtime |
| eSpeak NG | GPL v3 | Data/phonemes used by the model |

Note on GPL Compatibility: The Kokoro model utilizes data derived from eSpeak NG (GPL v3). If you modify and redistribute the model files or this package, you must comply with the terms of the GPL v3 where applicable.

Credits

sherpa-onnx - For the amazing WebAssembly TTS engine.
hexgrad - For training and releasing the Kokoro model.
n8n - For the workflow automation platform.