bpm4b

v12.0.0

Published

a month ago

Professional Multimedia Converter - Local Kokoro-82M TTS & Premium Audio Conversion

Downloads

231

jdjchelp

mp3 m4b audiobook converter ffmpeg epub epub-creator tts voice-cloning metadata-editor batch-merge kokoro ai-audiobook document-to-epub multimedia-suite

BPM4B

Professional Multimedia Processing Suite

Convert MP3 ↔ M4B · Generate AI Audiobooks · Process Documents to Audio

npm install bpm4b
# Node.js

pip install bpm4b
# Python

What's New

| Version | Feature |
| ------- | ------- |
| v12 | ⚡ 5x faster processing · EPUB to Audiobook · 50+ doc formats · Google Colab · 99.8%+ TTS coverage · OCR for PDFs is not included yet and is planned for v13 |
| v11 | ✍️ Interactive Pro Editor · Neural Narration Studio · Multi-voice dialogue |
| v10 | 📚 M4B → MP3 · Document to Audiobook · Audio Format Converter |

Installation

npm

npm install bpm4b

PyPI

pip install bpm4b

Global CLI

npm install -g bpm4b

# Development (run inside a local checkout of the repository)
npm link

Update

npm update -g bpm4b

Premium Narration (Abogen)

Abogen is the recommended TTS engine powered by Kokoro-82M.

Install espeak-ng

https://github.com/espeak-ng/espeak-ng/releases

Option 1 — Install using uv (Recommended)

# NVIDIA CUDA 12.8 — recommended
uv tool install --python 3.12 abogen[cuda] \
  --extra-index-url https://download.pytorch.org/whl/cu128 \
  --index-strategy unsafe-best-match

# NVIDIA CUDA 12.6 — older drivers
uv tool install --python 3.12 abogen[cuda126] \
  --extra-index-url https://download.pytorch.org/whl/cu126 \
  --index-strategy unsafe-best-match

# NVIDIA CUDA 13.0 — newer drivers
uv tool install --python 3.12 abogen[cuda130] \
  --extra-index-url https://download.pytorch.org/whl/cu130 \
  --index-strategy unsafe-best-match

# CPU only / AMD on Windows
uv tool install --python 3.12 abogen

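Option 2: Install using pip (Windows)

mkdir abogen
cd abogen

python -m venv venv

# CMD
venv\Scripts\activate.bat

# PowerShell
venv\Scripts\Activate.ps1

# NVIDIA GPUs (kept on one line: "\" is not a line continuation in CMD or PowerShell)
pip install torch==2.8.0+cu128 torchvision==0.23.0+cu128 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128

# Then install Abogen itself
pip install abogen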

For more install methods, see INSTALL.md

macOS

brew install espeak-ng

# Apple Silicon
uv tool install --python 3.13 abogen \
  --with "kokoro @ git+https://github.com/hexgrad/kokoro.git,numpy<2"

# Intel
uv tool install --python 3.12 abogen \
  --with "kokoro @ git+https://github.com/hexgrad/kokoro.git,numpy<2"

For additional setup methods including pip, see INSTALL.md

Usage

Web Interface

bpm4b web

# Custom port
bpm4b web --port 8080

# Custom host, with debug mode enabled
bpm4b web --host 127.0.0.1 --debug

Default URL:

http://localhost:5000

Google Colab

# Auto-detect tunnel
bpm4b web --enable-tunnel

# LocalTunnel
bpm4b web --enable-tunnel --tunnel-service localtunnel

# ngrok
bpm4b web --enable-tunnel --tunnel-service ngrok

See COLAB_USAGE.md for the full Colab setup guide.

CLI — MP3 to M4B

Basic

bpm4b convert input.mp3 output.m4b

Chapters (Seconds)

bpm4b convert book.mp3 book.m4b \
  --chapter "Prologue" 0 \
  --chapter "Chapter 1" 300 \
  --chapter "Chapter 2" 1800

Chapters (MM:SS)

bpm4b convert book.mp3 book.m4b \
  --chapter "Prologue" "0:00" \
  --chapter "Chapter 1" "5:00" \
  --chapter "Chapter 2" "30:00"

Mixed Formats

bpm4b convert book.mp3 book.m4b \
  --chapter "Intro" 0 \
  --chapter "Chapter 1" "6:30" \
  --chapter "Chapter 2" 3600

Help

bpm4b --help
bpm4b web --help
bpm4b convert --help

Chapter timestamps accept:
- Integers (seconds)
- "MM:SS"
- "MM:SS.f" (fractional seconds)

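The accepted timestamp forms can be normalized to seconds with a small helper. This is a minimal sketch, not the package's own parser: `parseTimestamp` is a hypothetical name, and the exact validation rules bpm4b applies are an assumption.

```javascript
// Sketch: normalize a chapter timestamp to seconds.
// parseTimestamp is a hypothetical helper, not part of the bpm4b API.
function parseTimestamp(value) {
  if (typeof value === 'number') {
    if (!Number.isFinite(value) || value < 0) {
      throw new RangeError(`Invalid timestamp: ${value}`);
    }
    return value;
  }
  const text = String(value).trim();
  // Plain digits such as "300" are treated as seconds.
  if (/^\d+(\.\d+)?$/.test(text)) return Number(text);
  // "MM:SS" or "MM:SS.f"; minutes may exceed 59 (for example "90:00").
  const match = /^(\d+):([0-5]?\d(?:\.\d+)?)$/.exec(text);
  if (!match) throw new RangeError(`Invalid timestamp: ${value}`);
  return Number(match[1]) * 60 + Number(match[2]);
}

console.log(parseTimestamp(300));     // 300
console.log(parseTimestamp('6:30'));  // 390
console.log(parseTimestamp('30:00')); // 1800
```

Normalizing everything to seconds up front makes it easy to mix formats in one chapter list, as the "Mixed Formats" example does.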
npm Scripts

npm start

npm run web

npm run convert -- input.mp3 output.m4b

Programmatic Usage (Node.js)

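const { convertMp3ToM4b } = require('bpm4b');

// `await` is not allowed at the top level of a CommonJS module,
// so the call is wrapped in an async function.
async function main() {
  await convertMp3ToM4b('input.mp3', 'output.m4b', [
    { title: 'Chapter 1', start_time: 0 },   // start times in seconds
    { title: 'Chapter 2', start_time: 3600 }
  ]);
}

main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});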
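For readers curious how a chapter array shaped like the `convertMp3ToM4b` argument can end up inside an M4B: FFmpeg can read chapters from an FFMETADATA text file. The sketch below converts `{ title, start_time }` entries to that format. Whether bpm4b does this internally is an assumption, and `toFfmetadata` and `totalSeconds` are hypothetical names.

```javascript
// Sketch: render a chapter array as FFmpeg FFMETADATA text.
// toFfmetadata is a hypothetical helper, not part of the bpm4b API.
function toFfmetadata(chapters, totalSeconds) {
  // FFMETADATA requires "=", ";", "#", "\" and newlines to be backslash-escaped.
  const escape = (s) => String(s).replace(/([=;#\\\n])/g, '\\$1');
  const lines = [';FFMETADATA1'];
  chapters.forEach((chapter, i) => {
    const next = chapters[i + 1];
    // A chapter ends where the next one starts; the last ends with the audio.
    const end = next ? next.start_time : totalSeconds;
    lines.push(
      '[CHAPTER]',
      'TIMEBASE=1/1000', // START and END are given in milliseconds
      `START=${Math.round(chapter.start_time * 1000)}`,
      `END=${Math.round(end * 1000)}`,
      `title=${escape(chapter.title)}`
    );
  });
  return lines.join('\n') + '\n';
}

console.log(toFfmetadata(
  [
    { title: 'Chapter 1', start_time: 0 },
    { title: 'Chapter 2', start_time: 3600 },
  ],
  7200
));
```

A file like this is typically passed to FFmpeg as a second input together with `-map_metadata 1`.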

Features

✍️ Interactive Pro Editor (v11)

Full text viewer for editing chapter content
Manual manifest controls
Merge chapters instantly
Rename titles
Exclude segments with one click
Dual boundary verification
Universal document support:
- PDF
- EPUB
- DOCX
- TXT
- Markdown
- More

🎙️ Neural Narration Studio (v11)

Automatic chapter announcements
Multi-voice dialogue generation
Narrative vs dialogue voice separation
Powered by Kokoro-82M

⚡ Performance (v12)

Parallel audio processing with configurable concurrency.

| Metric | Result |
| -------------------- | ----------- |
| Sequential (100 files) | ~10 min |
| Parallel, 8 workers (100 files) | ~2 min |
| Fast Mode + Parallel (100 files) | ~1.5 min |
| TTS Synthesis | 2–3x faster |
| Text Coverage | 99.8%+ |
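"Configurable concurrency" usually means a bounded worker pool: at most N jobs run at once, and each finished job frees a slot for the next. The following is a generic sketch of that pattern, not bpm4b's actual scheduler; `mapWithConcurrency` and the simulated worker are hypothetical.

```javascript
// Sketch: run async jobs with at most `concurrency` in flight.
// mapWithConcurrency is a hypothetical helper, not part of the bpm4b API.
async function mapWithConcurrency(items, concurrency, worker) {
  const results = new Array(items.length);
  let nextIndex = 0;
  // Each runner pulls the next unclaimed index until none are left.
  async function runner() {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await worker(items[index], index);
    }
  }
  const count = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: count }, runner));
  return results; // same order as `items`
}

// Demo with simulated work; a real worker would convert one file.
const files = ['01.mp3', '02.mp3', '03.mp3', '04.mp3'];
mapWithConcurrency(files, 2, async (file) => {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return `${file} done`;
}).then((results) => console.log(results));
```

Results are stored by index, so output order matches input order even when jobs finish out of order, which matters when the files become consecutive chapters.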

🎙️ EPUB to Audiobook (v12)

Abogen Engine (Default)

curl -X POST http://localhost:5000/api/epub-to-audiobook \
  -F "document_file=@/path/to/book.epub" \
  -F "voice=default" \
  -F "language=en" \
  -F "speed=1.0"

BPM4B Engine

curl -X POST http://localhost:5000/api/epub-to-audiobook \
  -F "document_file=@/path/to/book.epub" \
  -F "voice=af_sky" \
  -F "language=en" \
  -F "speed=1.0" \
  -F "engine=bpm4b"

Engine Comparison

| Feature | Abogen | BPM4B |
| ---------------- | -------------- | ------------------- |
| Stability | ✅ Reliable | ⚠️ Inconsistent |
| Speed | Standard | ⚡ Faster |
| Auto-install | ✅ Yes | - |
| Parallel workers | - | ✅ Up to 8 |
| Voices | 50+ Kokoro Voices | 50+ Kokoro Voices |
| Memory Usage | Standard | Lower |
| Recommended | ✅ Yes | Speed-critical only |

BPM4B engine may be faster for large files but can produce inconsistent results. Abogen is recommended for reliability.

Supported Languages & Voices

| Language | Voices |
| --------------------- | ------ |
| 🇺🇸 American English | 20 |
| 🇬🇧 British English | 8 |
| 🇨🇳 Chinese | 8 |
| 🇯🇵 Japanese | 5 |
| 🇮🇳 Hindi | 4 |
| 🇧🇷 Portuguese | 3 |
| 🇪🇸 Spanish | 3 |
| 🇮🇹 Italian | 2 |
| 🇫🇷 French | 1 |

Playback speed range:

0.5x → 2.0x
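The same EPUB request can be assembled from Node.js. This sketch only builds and validates the multipart fields, using the field names listed in the API Reference (`document_file`, `voice`, `language`, `speed`, `engine`). `buildEpubForm` is a hypothetical helper, `book.epub` is a placeholder filename, and client-side range checking is an assumption about good practice rather than documented server behavior.

```javascript
// Sketch: assemble the multipart fields for POST /api/epub-to-audiobook.
// Requires Node.js 18+ for the global FormData and Blob.
// buildEpubForm is a hypothetical helper, not part of the bpm4b API.
function buildEpubForm(epubBytes, options = {}) {
  const { voice = 'default', language = 'en', speed = 1.0, engine } = options;
  if (!Number.isFinite(speed) || speed < 0.5 || speed > 2.0) {
    throw new RangeError('speed must be between 0.5 and 2.0');
  }
  if (engine !== undefined && !['abogen', 'bpm4b'].includes(engine)) {
    throw new RangeError('engine must be "abogen" or "bpm4b"');
  }
  const form = new FormData();
  // "book.epub" is a placeholder filename for the uploaded part.
  form.append('document_file', new Blob([epubBytes]), 'book.epub');
  form.append('voice', voice);
  form.append('language', language);
  form.append('speed', String(speed));
  if (engine) form.append('engine', engine);
  return form;
}

// Usage (not executed here, needs a running server):
//   const form = buildEpubForm(fs.readFileSync('book.epub'), { voice: 'af_sky', engine: 'bpm4b' });
//   await fetch('http://localhost:5000/api/epub-to-audiobook', { method: 'POST', body: form });
```

Rejecting an out-of-range speed before uploading avoids sending a large file only to have the request fail.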

📄 Document to EPUB (v12)

curl -X POST http://localhost:5000/api/document-to-epub \
  -F "document_file=@/path/to/document.pdf" \
  -F "title=My Book" \
  -F "author=Author Name" \
  -F "language=en"

Documents

PDF
DOCX
DOC
DOCM
DOT
DOTX
ODT
ODM
OTT
ABW
WPD

Text

TXT
TEXT
ASC
ANSI
LOG
ME
MD
MARKDOWN
RTF
CSV

Web

HTML
HTM
XHTML
XHT
XML

Academic

Detected chapter patterns:

Chapter 1
CHAPTER I
Part 1
Book 1
Volume 1
Markdown headers (##, ###)
HTML headings (h1, h2, h3)
Roman numerals (I, II, III)
Numbered lists (1. Title)
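Heading patterns like these are commonly matched line by line with regular expressions. The sketch below is a heuristic illustration only: the actual rules in `chapter-detector.js` are not shown in this README, so every pattern here is an assumption, and `detectHeading` is a hypothetical name.

```javascript
// Sketch: classify a line of text as a chapter heading.
// All patterns are illustrative guesses, not bpm4b's real rules.
const ROMAN = '[IVXLCDM]+';
const HEADING_PATTERNS = [
  // "Chapter 1", "CHAPTER I", "Part 1", "Book 1", "Volume 1"
  { kind: 'keyword', re: new RegExp(`^(chapter|part|book|volume)\\s+(\\d+|${ROMAN})\\b`, 'i') },
  // Markdown headers: "## Title", "### Title"
  { kind: 'markdown', re: /^#{2,3}\s+\S/ },
  // HTML headings h1 to h3 on a single line
  { kind: 'html', re: /^<h[1-3][^>]*>.*<\/h[1-3]>$/i },
  // A line consisting only of a Roman numeral: "I", "II.", "III"
  { kind: 'roman', re: new RegExp(`^${ROMAN}\\.?$`) },
  // Numbered list style: "1. Title"
  { kind: 'numbered', re: /^\d+\.\s+\S/ },
];

function detectHeading(line) {
  const text = line.trim();
  const hit = HEADING_PATTERNS.find(({ re }) => re.test(text));
  return hit ? hit.kind : null;
}

for (const line of ['CHAPTER I', '## Intro', '<h2>One</h2>', 'III', '1. Title', 'It was a dark night.']) {
  console.log(JSON.stringify(line), '->', detectHeading(line));
}
```

Simple patterns like these produce false positives (an ordinary numbered list item also matches `numbered`), which is one reason a manual editor for merging and excluding segments is useful.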

🗂️ File Conversion

| Feature | Description | Since |
| ---------------------- | ----------------------------------- | ----- |
| MP3 → M4B | Embedded chapters | v1 |
| M4B → MP3 | High-fidelity extraction | v10 |
| Document → Audiobook | PDF/Text via Kokoro-82M | v10 |
| Audio Format Converter | MP3/WAV/FLAC/AAC/OGG/ALAC | v10 |
| Folder → M4B | Batch conversion with auto chapters | v10 |
| EPUB → Audiobook | Multi-language dual-engine support | v12 |

Audio Format Converter

Supported conversions:

MP3 ↔ WAV
FLAC → MP3
AAC → OGG
OGG → WAV
ALAC → FLAC

Supported quality:

128k
192k
256k
320k
Lossless
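Since the tool builds on FFmpeg, a format and quality selection plausibly maps to an FFmpeg argument list along these lines. This is a sketch under assumptions: the codec choices are common FFmpeg encoders (availability depends on the FFmpeg build), and `buildFfmpegArgs` is a hypothetical name, not the command bpm4b actually generates.

```javascript
// Sketch: map a target format and quality to FFmpeg arguments.
// Codec choices are assumptions; bpm4b's real command may differ.
const CODECS = {
  mp3: { codec: 'libmp3lame', ext: 'mp3', lossless: false },
  aac: { codec: 'aac', ext: 'm4a', lossless: false },
  ogg: { codec: 'libvorbis', ext: 'ogg', lossless: false },
  wav: { codec: 'pcm_s16le', ext: 'wav', lossless: true },
  flac: { codec: 'flac', ext: 'flac', lossless: true },
  alac: { codec: 'alac', ext: 'm4a', lossless: true },
};
const BITRATES = ['128k', '192k', '256k', '320k'];

function buildFfmpegArgs(input, outputBase, format, quality = '192k') {
  const target = Object.hasOwn(CODECS, format) ? CODECS[format] : undefined;
  if (!target) throw new Error(`Unsupported format: ${format}`);
  // -vn drops any video stream, such as embedded cover art.
  const args = ['-i', input, '-vn', '-c:a', target.codec];
  if (!target.lossless) {
    if (!BITRATES.includes(quality)) throw new Error(`Unsupported quality: ${quality}`);
    args.push('-b:a', quality); // bitrate only applies to lossy codecs
  }
  args.push(`${outputBase}.${target.ext}`);
  return args;
}

console.log(['ffmpeg', ...buildFfmpegArgs('in.flac', 'out', 'mp3', '320k')].join(' '));
console.log(['ffmpeg', ...buildFfmpegArgs('in.mp3', 'out', 'wav')].join(' '));
```

Returning an argument array rather than a shell string lets it go straight to `child_process.spawn` without quoting problems.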

🎨 Theme System (v11)

25+ themes with real-time switching.

Included Themes

Dark
Light
Matrix
Cyberpunk
Dracula
Monokai
Vaporwave
Emerald Forest
Purple Galaxy
Sunset Orange
Blue Ocean
Cherry Blossom
Golden Hour
Midnight Depth
Royal Velvet
Arctic Frost
Volcanic Ash
Coffee House
Leafy Greens
Ocean Breeze
Lavender Dream
Steel City
Ruby Red
Solarized Light
High Contrast

📝 Metadata Editor

Edit title
Edit author
Edit genre
Edit description
Auto-fill from Open Library
Upload and embed cover art
One-click apply + download

🎤 Voice Cloning — KokoClone

Upload a 3–10 second voice sample
Synthesize text
Re-voice existing audio
Multi-language support:
- English
- Hindi
- French
- Japanese
- Chinese
- Italian
- Portuguese
- Spanish

Kokoro-ONNX
Kanade voice conversion

🔀 Batch Merge — Audio Glue

Upload multiple MP3s
Merge sequentially
Add metadata
Add cover art
Download a single combined file
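Sequential merging is often done with FFmpeg's concat demuxer, which reads a text file listing the inputs in order. The sketch below generates such a list. Whether Audio Glue uses this mechanism is an assumption, and `buildConcatList` is a hypothetical name.

```javascript
// Sketch: build the list file consumed by FFmpeg's concat demuxer.
// buildConcatList is a hypothetical helper, not part of the bpm4b API.
function buildConcatList(files) {
  return files
    // Inside single quotes, a literal quote is written as '\''.
    .map((file) => `file '${file.replace(/'/g, "'\\''")}'`)
    .join('\n') + '\n';
}

process.stdout.write(buildConcatList(['part-01.mp3', 'part-02.mp3', "author's note.mp3"]));
```

With the output saved as `list.txt`, a typical invocation is `ffmpeg -f concat -safe 0 -i list.txt -c copy merged.mp3`, which joins the files without re-encoding when they share the same codec parameters.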

⏱️ Automatic Chapter Builder

Enter title + duration
Automatic timestamp generation
HH:MM:SS support
Batch import/export
Real-time preview
Minutes or seconds toggle
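Generating timestamps from titles and durations is a running sum: each chapter starts where the previous one ended. Here is a minimal sketch of that calculation. `buildChapters` and `formatHms` are hypothetical names, and the output shape mirrors the `{ title, start_time }` objects used elsewhere in this README.

```javascript
// Sketch: turn (title, duration) pairs into chapter start times.
// buildChapters and formatHms are hypothetical helpers.
function formatHms(totalSeconds) {
  const s = Math.floor(totalSeconds);
  const pad = (n) => String(n).padStart(2, '0');
  return `${pad(Math.floor(s / 3600))}:${pad(Math.floor((s % 3600) / 60))}:${pad(s % 60)}`;
}

function buildChapters(entries, unit = 'seconds') {
  const factor = unit === 'minutes' ? 60 : 1; // the minutes/seconds toggle
  let cursor = 0;
  return entries.map(({ title, duration }) => {
    const chapter = { title, start_time: cursor, label: formatHms(cursor) };
    cursor += duration * factor; // the next chapter starts where this one ends
    return chapter;
  });
}

console.log(buildChapters([
  { title: 'Prologue', duration: 5 },
  { title: 'Chapter 1', duration: 25 },
  { title: 'Chapter 2', duration: 40 },
], 'minutes'));
// start_time values: 0, 300, 1800 (labels 00:00:00, 00:05:00, 00:30:00)
```

These start times match the ones used in the CLI chapter examples (0, 300, 1800).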

🌐 Google Colab Defaults (v12)

| Setting | Value |
| ------------------ | ------------- |
| Concurrency | All CPU cores |
| Fast Mode | Enabled |
| Audio Quality | 128k |
| Max Parallel Files | 16 |

⚙️ UI & Settings

25+ real-time themes
Glassmorphism design
Drag-and-drop upload
SSE progress bars
Live terminal logs
Copy-to-clipboard commands
FFmpeg command preview

Web Interface Guide

| Tool | Usage |
| --------------- | -------------------------------------- |
| MP3 → M4B | Upload MP3 and add chapters |
| M4B → MP3 | Convert M4B/M4A to MP3 |
| Audiobook Gen | Upload PDF/Text and generate narration |
| Metadata Editor | Edit metadata and embed covers |
| KokoClone | Voice cloning and synthesis |
| Audio Glue | Merge multiple MP3 files |

Advanced — Folder to M4B

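const { folderToM4b } = require('bpm4b');

// Wrapped in an async function because CommonJS modules
// do not support top-level `await`.
async function main() {
  await folderToM4b('/path/to/folder', '/path/to/output.m4b', {
    concurrency: 8,
    fastMode: true,
    audioQuality: '128k',

    metadata: {
      title: 'My Audiobook',
      author: 'Author Name',
      genre: 'Audiobook'
    },

    // Called with a completion percentage and a status message.
    onProgress: (percent, msg) => {
      console.log(`${percent}% - ${msg}`);
    }
  });
}

main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});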

API Reference

`POST /api/mp3-to-m4b`

| Field | Type | Description |
| ---------- | ---- | ---------------------- |
| `mp3_file` | file | Input MP3 |
| `chapters` | JSON | Optional chapter array |

Example:

[
  {
    "title": "Chapter 1",
    "start_time": 0
  },
  {
    "title": "Chapter 2",
    "start_time": "6:30"
  },
  {
    "title": "Chapter 3",
    "start_time": 3600
  }
]

`POST /api/convert`

| Field | Type | Description |
| --------------- | ------ | ----------------- |
| `source_file` | file | MP3 or M4B |
| `output_name` | string | Custom filename |
| `audio_quality` | string | Bitrate |
| `chapters` | JSON | Optional chapters |

`POST /api/generate-audiobook`

| Field | Type | Description |
| ------------- | ------ | --------------- |
| `doc_file` | file | PDF or Text |
| `voice` | string | Kokoro voice |
| `output_name` | string | Output filename |

`POST /api/epub-to-audiobook`

| Field | Type | Description |
| --------------- | ------ | ------------------- |
| `document_file` | file | EPUB file |
| `voice` | string | Voice name |
| `language` | string | Language code |
| `speed` | float | 0.5–2.0 |
| `engine` | string | `abogen` or `bpm4b` |

`POST /api/document-to-epub`

| Field | Type | Description |
| --------------- | ------ | ------------------ |
| `document_file` | file | Supported document |
| `title` | string | Book title |
| `author` | string | Author |
| `language` | string | Language code |

`POST /api/metadata/extract`

| Field | Type | Description |
| ------ | ---- | -------------- |
| `file` | file | M4B/M4A source |

Example response:

{
  "title": "Book Title",
  "author": "Author Name",
  "genre": "Fiction",
  "description": "Book description...",
  "coverBase64": "data:image/jpeg;base64,..."
}
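The `coverBase64` value in the example response is a data URI, so saving the cover to disk means splitting off the MIME type and decoding the base64 payload. A minimal sketch follows, assuming the field always has the `data:image/...;base64,` form shown above; `decodeCover` is a hypothetical name.

```javascript
// Sketch: decode a coverBase64 data URI into bytes plus a file extension.
// decodeCover is a hypothetical helper, not part of the bpm4b API.
function decodeCover(dataUri) {
  const match = /^data:(image\/[a-z0-9.+-]+);base64,(.*)$/is.exec(dataUri);
  if (!match) throw new Error('Expected a base64 image data URI');
  const mime = match[1].toLowerCase();
  const ext = mime === 'image/jpeg' ? 'jpg' : mime.split('/')[1];
  return { mime, ext, bytes: Buffer.from(match[2], 'base64') };
}

// Usage with a tiny stand-in payload ("hello" in base64), not a real image.
const cover = decodeCover('data:image/jpeg;base64,aGVsbG8=');
console.log(cover.mime, cover.ext, cover.bytes.length); // image/jpeg jpg 5
// require('fs').writeFileSync(`cover.${cover.ext}`, cover.bytes);
```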

`POST /api/metadata/apply`

| Field | Type | Description |
| -------------- | ------ | -------------------- |
| `file` | file | M4B/M4A source |
| `metadata` | JSON | Metadata object |
| `cover_base64` | string | Optional cover image |

`POST /api/convert-audio`

| Field | Type | Description |
| --------------- | ------ | ------------------------- |
| `file` | file | Audio source |
| `target_format` | string | mp3/wav/flac/aac/ogg/alac |
| `quality` | string | Bitrate or lossless |
| `job_id` | string | Optional SSE job ID |
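The optional `job_id` ties a conversion to a Server-Sent Events progress stream. The progress endpoint and its payload are not documented here, so the sketch below covers only the generic part: splitting raw SSE text into events according to the standard wire format. `parseSseChunk` is a hypothetical name, and the `{"percent": 50}` payload is an invented example.

```javascript
// Sketch: split a Server-Sent Events text buffer into complete events.
// parseSseChunk is a hypothetical helper; the payload shape is an assumption.
function parseSseChunk(buffer) {
  const events = [];
  const blocks = buffer.split(/\r?\n\r?\n/);
  const rest = blocks.pop(); // incomplete trailing block, keep for the next chunk
  for (const block of blocks) {
    const event = { event: 'message', data: [] };
    for (const line of block.split(/\r?\n/)) {
      if (line === '' || line.startsWith(':')) continue; // blank line or comment
      const colon = line.indexOf(':');
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? '' : line.slice(colon + 1);
      if (value.startsWith(' ')) value = value.slice(1); // one optional leading space
      if (field === 'data') event.data.push(value);
      else if (field === 'event') event.event = value;
      else if (field === 'id') event.id = value;
    }
    if (event.data.length > 0) {
      events.push({ ...event, data: event.data.join('\n') });
    }
  }
  return { events, rest };
}

const { events, rest } = parseSseChunk('data: {"percent": 50}\n\ndata: {"perc');
console.log(events.map((e) => JSON.parse(e.data))); // [ { percent: 50 } ]
console.log(JSON.stringify(rest));                  // "data: {\"perc"
```

Returning the unparsed remainder lets a caller prepend it to the next network chunk, since an event can be split across reads.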

`GET /api/health`

Returns:

Service health status
FFmpeg availability

Project Structure

bpm4b/
├── bin/
│   └── bpm4b.js
├── lib/
│   ├── core.js
│   ├── server.js
│   ├── audiobook-builder.js
│   └── chapter-detector.js
├── templates/
│   └── index.ejs
├── api/
│   └── index.js
├── uploads/
├── outputs/
├── vercel.json
└── package.json

Deploying to Vercel

Push project to GitHub
Import into Vercel
Deploy

vercel.json is auto-detected.

Important

Serverless functions may time out:

Hobby: 10s
Pro: 60s

For large files:

Use bpm4b web
Deploy on a dedicated server
Or upgrade Vercel plan

Notes

Max upload size: 2GB
Real-time SSE progress tracking
Kokoro AI runs locally
No API keys required
No external costs
FFmpeg bundled
Windows / macOS / Linux support

Typical M4B output size:

About 58 MB – 144 MB per hour (0.96 MB – 2.4 MB per minute) at 128k – 320k

Depends on bitrate.
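For a constant-bitrate encode, size follows directly from bitrate times duration. The sketch below works through that arithmetic for the supported bitrates. It ignores container overhead and cover art, so real files will be slightly larger; `estimateSizeMB` is a hypothetical name.

```javascript
// Sketch: estimate audio size from a constant bitrate.
// Ignores container overhead and embedded cover art.
function estimateSizeMB(bitrateKbps, durationSeconds) {
  // kilobits/s * seconds = kilobits; / 8 = kilobytes; / 1000 = megabytes
  return (bitrateKbps * durationSeconds) / 8 / 1000;
}

for (const kbps of [128, 192, 256, 320]) {
  console.log(`${kbps}k: ${estimateSizeMB(kbps, 3600).toFixed(1)} MB per hour`);
}
// 128k: 57.6 MB per hour
// 192k: 86.4 MB per hour
// 256k: 115.2 MB per hour
// 320k: 144.0 MB per hour
```

By the same formula, one minute at 128k is 0.96 MB.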

Contact

X (Twitter)

@jdjchelp

License

MIT License