voiceai-cli

v0.1.10

Published

a day ago

Voiceai CLI — text-to-speech, speech-to-text, streaming

picsoung

voiceai

The official Voiceai CLI — text-to-speech, speech-to-text, real-time streaming.

         _|    _|    _|_|_|  _|        _|      _|    _|_|_|
       _|    _|    _|        _|        _|_|    _|  _|
     _|    _|        _|_|    _|        _|  _|  _|  _|  _|_|
   _|    _|              _|  _|        _|    _|_|  _|    _|
 _|    _|          _|_|_|    _|_|_|_|  _|      _|    _|_|_|

   Voice AI for builders — text-to-speech, speech-to-text, real-time.

   ❯ 🗣  Text → Speech - Synthesize
     👂  Speech → Text - Transcribe
     ⚙️   Settings
     ❌   Quit

   ctrl+c quit

Run voiceai to open the interactive TUI above, or pass flags to script it.

Install

Homebrew (macOS, Linux)

brew install slng-ai/tap/voiceai

curl one-liner

curl -fsSL https://docs.slng.ai/install.sh | sh

Installs to /usr/local/bin/voiceai. To install elsewhere: curl -fsSL https://docs.slng.ai/install.sh | PREFIX=$HOME/.local/bin sh.
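When installing to a custom PREFIX, the target directory also has to be on your PATH before `voiceai` will resolve. A minimal sketch of that check follows; the `on_path` helper is hypothetical and is not part of install.sh.

```shell
# Hypothetical helper (not part of install.sh): succeeds when directory $1
# is one of the colon-separated entries of the PATH-style string $2.
on_path() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

prefix="$HOME/.local/bin"
if on_path "$prefix" "$PATH"; then
  echo "$prefix is on PATH"
else
  echo "add to your shell profile: export PATH=\"$prefix:\$PATH\""
fi
```

Wrapping both strings in colons makes the match exact, so /opt does not falsely match an entry such as /opt/bin.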

npm

npm i -g voiceai-cli

The package is named voiceai-cli; the installed binary is voiceai. The postinstall step downloads a pre-built binary for your platform. Use Homebrew or the curl one-liner if you would rather not run a download during npm's postinstall.

macOS Gatekeeper note

The pre-built macOS binary is currently unsigned. The first time you run it, Gatekeeper may block it. To clear the quarantine:

xattr -d com.apple.quarantine $(which voiceai)

Or right-click voiceai in Finder and choose Open once.

Configure

The fastest way:

voiceai login                          # interactive: prompts for profile name + key, verifies it

Or set values directly:

voiceai config set apiKey zpka_…

You can also set VOICEAI_API_KEY in your environment. The first time you launch the TUI without a key, it'll prompt for one and save it.

Get a key at https://app.slng.ai/api-keys.

Profiles

Credentials and settings live in named profiles, AWS-style. Run voiceai login (or voiceai config add <name>) to create one; switch with voiceai config use <name>; override per command with --profile <name> or VOICEAI_PROFILE=<name>.

voiceai login --profile work           # create / update the "work" profile
voiceai config profiles                # list all profiles (★ marks the current)
voiceai config use work                # persistent default
voiceai --profile default whoami       # one-off override
voiceai config remove staging          # delete a profile

The TUI's Settings → Profile menu does the same things interactively (add, switch, remove with confirmation).

Quick start

voiceai tts "Hello from Voiceai"               # synth + play locally
voiceai tts "Save this" --out hi.mp3           # save to a file
voiceai stt audio.wav                          # transcribe a file
voiceai stt --stream                           # live mic → transcripts

Interactive mode

voiceai with no args opens the TUI. It remembers your last-used model and voice in ~/.config/voiceai/config.json, so subsequent runs skip the pickers. The Settings → Profile menu lets you switch, add, or remove profiles without leaving the TUI.

TTS flow

Language: English ▼
Model:    ★ slng/deepgram/aura:2-en
Voice:    Amalthea · feminine · Engaging
Text:     Hello from Voiceai
          (enter to synthesize)

Slng-hosted models float to the top of the picker with a yellow ★. Per-model voice catalogs include name, gender, tone, and language so you're not picking from a wall of UUIDs.

STT flow

Model:  ★ slng/deepgram/nova:3-en
Source: 🎙  Microphone (realtime) | 📂 Audio file (one-shot)
Input:  MacBook Pro Microphone

● slng/deepgram/nova:3-en  (space to pause)
  Hello world how are you

Mic mode opens a WebSocket and streams 16-bit PCM frames; partial transcripts appear in dim italic, finals get appended. File mode does a one-shot HTTP upload.

Flag mode

Text → speech

# Friendly voice name resolves to the upstream voiceId.
voiceai tts "hi" -m slng/deepgram/aura:2-en -v amalthea

# Save to a path of your choice (audio still plays unless stdout is a pipe).
voiceai tts "save me" --out ~/voice.mp3

# Pipe raw audio bytes — useful in scripts.
voiceai tts "binary" > out.mp3

# Stream chunks via WebSocket for low-latency playback.
voiceai tts "stream me" --stream | ffplay -

# Pin a deployment region.
voiceai tts "regional" --region eu-north-1

Without --out, audio is also written to $TMPDIR/voiceai-tts/ so you can replay or re-export later.
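To replay the most recent synthesis, you can look up the newest file in that directory. The sketch below is one way to do it under two assumptions: that the directory falls back to /tmp when TMPDIR is unset, and that the newest file is the one you want. The file names the CLI uses are not documented here, so the helper does not depend on them.

```shell
# Print the most recently modified file in directory $1, or nothing if the
# directory is missing or empty. Parsing `ls` is fine for a cache of audio
# files but would break on file names that contain newlines.
latest_file() {
  dir=$1
  [ -d "$dir" ] || return 0
  newest=$(ls -t "$dir" | head -n 1)
  [ -n "$newest" ] && printf '%s/%s\n' "$dir" "$newest"
  return 0
}

# TMPDIR is often unset on Linux, so fall back to /tmp (an assumption about
# how the CLI resolves the directory).
latest_file "${TMPDIR:-/tmp}/voiceai-tts"
```

The printed path can be handed to whichever player you have installed, for example afplay on macOS or ffplay on Linux.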

Speech → text

# One-shot transcription of an audio file.
voiceai stt audio.wav -m slng/deepgram/nova:3-en

# Live mic → transcripts.
voiceai stt --stream

# Pipe raw 16-bit PCM (16 kHz mono) from any source.
arecord -f S16_LE -r 16000 -c 1 | voiceai stt --stream --source stdin
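
When feeding raw PCM from your own source, it helps to know how many bytes the stated format (16-bit, 16 kHz, mono) produces. The arithmetic is sketched below; the function names are hypothetical, and the 20 ms chunk length is an illustrative value, not something the CLI documents.

```shell
# Bytes per second of raw PCM: sample_rate * channels * bits / 8.
#   $1 = sample rate in Hz, $2 = channels, $3 = bits per sample
pcm_bytes_per_second() {
  echo $(( $1 * $2 * $3 / 8 ))
}

# Bytes in a chunk lasting $4 milliseconds (chunk length is an assumed,
# illustrative value).
pcm_chunk_bytes() {
  echo $(( $1 * $2 * $3 / 8 * $4 / 1000 ))
}

pcm_bytes_per_second 16000 1 16   # prints 32000
pcm_chunk_bytes 16000 1 16 20     # prints 640
```

At 32000 bytes per second, one minute of audio is 1,920,000 bytes, which is a useful sanity check when a capture file looks too small or too large.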

Catalogs

# All deployed models, both TTS and STT.
voiceai models

# Filter by service type, or emit machine-readable output for scripts.
voiceai models --tts
voiceai models --json | jq '.tts[] | .id'

# Voices for a specific TTS model. --voice in `tts` accepts the friendly
# name from this list (case-insensitive).
voiceai voices --model slng/deepgram/aura:2-en
voiceai voices --model cartesia/sonic:3 --language fr
voiceai voices --model slng/deepgram/aura:2-en --json | jq '.[] | .name'

Auth check

# Verify VOICEAI_API_KEY against the agents API (no TTS/STT credits used).
voiceai whoami
voiceai whoami --json | jq .ok

Configuration

voiceai config get                         # print the current profile (apiKey masked)
voiceai config get defaultTtsModel         # single value
voiceai config set apiKey zpka_…           # write to the current profile
voiceai config set --profile work apiKey zpka_…   # write to a specific profile
voiceai config set defaultTtsModel slng/deepgram/aura:2-en
voiceai config set defaultTtsVoice amalthea
voiceai config profiles                    # list profiles (★ marks the current)
voiceai config use work                    # set persistent default
voiceai config add staging                 # add a profile interactively
voiceai config remove staging              # delete a profile
voiceai config reset --force               # wipe ~/.config/voiceai + legacy slng dir

Setting defaultTtsModel (and optionally defaultTtsVoice) skips the picker steps in the TUI. Same for defaultSttModel / defaultSttMode / defaultSttInput.
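
If you set the same defaults on several machines or profiles, the `config set` calls can be wrapped in a small loop. This is a sketch only: `set_defaults` is a hypothetical helper, and it relies on the `voiceai config set --profile <name> <key> <value>` form shown above.

```shell
# Hypothetical helper: apply key=value pairs to a profile, one
# `voiceai config set` call per pair. Stops at the first failure.
#   $1 = profile name, remaining arguments = key=value pairs
set_defaults() {
  profile=$1
  shift
  for pair in "$@"; do
    key=${pair%%=*}     # text before the first "="
    value=${pair#*=}    # text after the first "="
    voiceai config set --profile "$profile" "$key" "$value" || return 1
  done
}
```

For example, `set_defaults work defaultSttModel=slng/deepgram/nova:3-en defaultSttMode=mic` would run two `config set` calls against the work profile.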

config reset is what brew uninstall won't do for you — Homebrew leaves files in ~/.config/ untouched. Run it before uninstalling, or any time you want the TUI to show the first-run API-key prompt again. Pass --all to also clear the $TMPDIR/voiceai-tts/ replay cache.

Configuration reference

~/.config/voiceai/config.json stores one or more named profiles:

{
  "currentProfile": "default",
  "profiles": {
    "default": { "apiKey": "zpka_…", "defaultTtsModel": "slng/deepgram/aura:2-en" },
    "work":    { "apiKey": "zpka_…", "baseUrl": "https://stageapi.slng.ai" }
  }
}

The file is written with mode 0600. Older flat-shaped configs auto-migrate into a default profile on first run.

Profile resolution precedence (the first source that is set wins): --profile <name> flag → VOICEAI_PROFILE env → currentProfile in the file → literal "default".
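
The precedence can be read as a simple fallback chain. The function below is an illustrative sketch of that chain, not the CLI's actual implementation; `resolve_profile` and its argument order are hypothetical.

```shell
# Sketch of the documented precedence (not the CLI's own code).
#   $1 = value of --profile (empty if not passed)
#   $2 = value of VOICEAI_PROFILE (empty if unset)
#   $3 = currentProfile from config.json (empty if absent)
resolve_profile() {
  if [ -n "$1" ]; then
    echo "$1"
  elif [ -n "$2" ]; then
    echo "$2"
  elif [ -n "$3" ]; then
    echo "$3"
  else
    echo "default"
  fi
}

resolve_profile "" "staging" "work"   # prints staging
resolve_profile "" "" ""              # prints default
```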

Per-profile keys (env overrides apply to the resolved profile):

| Key | Env override | Description |
|---|---|---|
| apiKey | VOICEAI_API_KEY | Bearer token (zpka_…). |
| baseUrl | VOICEAI_BASE_URL | Override the API base URL (e.g. https://stageapi.slng.ai). |
| region | — | Pin every request to a region (auto if unset). |
| worldPart | — | Pin every request to a world-part (auto if unset). |
| defaultTtsModel | — | Skip the TTS model picker in the TUI. |
| defaultTtsVoice | — | Skip the TTS voice picker (requires defaultTtsModel). |
| defaultSttModel | — | Skip the STT model picker. |
| defaultSttMode | — | mic or file — skip the source picker. |
| defaultSttInput | — | Audio input device for mic mode (skip device picker). |

Additional environment variables:

| Env var | Description |
|---|---|
| VOICEAI_PROFILE | Select a named profile (overridden by --profile). |
| VOICEAI_AGENTS_BASE_URL | Override the agents API base URL (used by whoami and login). |
| VOICEAI_LOG | debug for verbose SDK logging (also enabled by --debug). |

External audio dependencies

The CLI shells out to your system's audio tools rather than opening devices directly. Install whichever's appropriate:

macOS: afplay (built-in). For STT mic: brew install sox.
Linux: ffplay (apt install ffmpeg) or paplay. For STT mic: apt install sox or apt install alsa-utils.
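
Before a first run, you can check which of these tools are already installed. The sketch below uses `command -v`; the `first_available` helper is hypothetical, and the probing order is an assumption rather than the order the CLI itself uses.

```shell
# Print the first command from the argument list that exists on PATH.
# Returns non-zero when none of them is found.
first_available() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      return 0
    fi
  done
  return 1
}

# Candidate names come from the list above; the order is an assumption,
# not necessarily the order the CLI probes.
player=$(first_available afplay ffplay paplay) || echo "no audio player found" >&2
recorder=$(first_available sox arecord) || echo "no mic recorder found" >&2
echo "playback: ${player:-none}, capture: ${recorder:-none}"
```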

Full SDKs (Node + Python) → voiceai-sdk on npm and PyPI
API reference → https://docs.slng.ai
Source → https://github.com/slng-ai/sdks/tree/main/cli