swictation
v0.7.27
Published
Cross-platform voice-to-text dictation for Linux and macOS with GPU acceleration (NVIDIA CUDA/CoreML), Secretary Mode (60+ natural language commands), Context-Aware Meta-Learning, and pure Rust performance. Meta-package that automatically installs platfor
Downloads
11,568
Maintainers
Readme
Swictation CLI
Command-line interface for managing the Swictation voice dictation daemon.
✨ Wayland Support - Fully Automated! ✨
NEW in v0.4.0: Swictation now fully supports Wayland-based Linux desktops with automated installation:
- ✅ GNOME Wayland (Ubuntu 25.10+) - Hotkeys auto-configured
- ✅ Sway - Text injection auto-setup
- ✅ GPU Acceleration - CUDA libraries auto-downloaded
- ✅ Service Management - Auto-enabled systemd service
One command installation:
npm install -g swictation --foreground-scripts
# Everything is configured automatically!→ Complete Wayland Support Guide
Features
- 🎤 Real-time voice transcription using Parakeet-TDT-1.1B (NVIDIA)
- 📝 Secretary Mode - 60+ natural language commands for dictation
- Say "comma" → Get ","
- Say "mr smith said quote hello quote" → Get "Mr. Smith said 'Hello'"
- Automatic capitalization, punctuation, numbers, quotes, formatting
- → Full Secretary Mode Guide
- 🔄 Smart text transformation - MidStream Rust library (~5µs latency)
- ⚡ Low latency - Pure Rust implementation with CUDA acceleration
- 🖥️ Wayland & X11 Support - Full dual display server support
- 🎯 Hotkey support - Auto-configured on GNOME, manual setup on Sway
- 📊 Real-time metrics - WPM, latency, GPU/CPU usage
- 🦀 Pure Rust daemon - Zero Python runtime dependencies
System Requirements
Required
- Operating System: Ubuntu 24.04 LTS or later (x64)
- Node.js: >= 18.0.0
- GLIBC: >= 2.39 (Ubuntu 24.04+)
- libasound2: Audio library for ALSA
# Ubuntu 24.04+ sudo apt-get install libasound2t64 # Ubuntu 22.04 and older sudo apt-get install libasound2
Optional (for GPU Acceleration)
- NVIDIA GPU: Any CUDA-capable GPU
- CUDA: 12.x or later
- VRAM: 4GB minimum (6GB+ recommended for 1.1B model)
- Driver: NVIDIA proprietary drivers (not nouveau)
Note: GPU acceleration libraries (~2.3GB CUDA + cuDNN) are automatically downloaded during installation. These include CUDA runtime and cuDNN libraries optimized for your GPU architecture.
If GPU is not available, Swictation will run in CPU-only mode (slower transcription).
Installation
One-Command Installation
Recommended: User-Local Installation (No sudo required)
If using nvm (Node Version Manager)
nvm already handles user-local installations automatically! No additional setup needed:
# Just install (nvm manages the prefix)
npm install -g swictation --foreground-scriptsThe installation will automatically:
- Download GPU acceleration libraries (~2.3GB, if NVIDIA GPU detected)
- Install and configure ydotool (Wayland text injection)
- Configure GNOME keyboard shortcuts (if on GNOME Wayland)
- Set up and enable systemd service
- Start the daemon service
Important: If you previously set a custom npm prefix, remove it:
npm config delete prefixIf NOT using nvm
Configure npm to install packages to your home directory:
# One-time setup: Configure npm to use user-local directory
mkdir -p ~/.npm-global
npm config set prefix ~/.npm-global
# Add to PATH (add this line to ~/.bashrc or ~/.zshrc)
export PATH="$HOME/.npm-global/bin:$PATH"
# Source your shell config to apply changes
source ~/.bashrc # or source ~/.zshrc
# Now install without sudo
npm install -g swictation --foreground-scriptsAlternative: System-Wide Installation (Requires sudo)
# Install with visible output
sudo npm install -g swictation --foreground-scripts
# Or standard install (output hidden by npm)
sudo npm install -g swictationWhy user-local installation is recommended:
- ✅ No sudo required
- ✅ No permission issues
- ✅ Each user has their own installation
- ✅ Cleaner and more secure
- ✅ Works with nvm seamlessly
Note: The --foreground-scripts flag makes the installation progress visible. The postinstall script will:
- Download GPU acceleration libraries (~2.3GB CUDA + cuDNN, if NVIDIA GPU detected)
- Install and configure ydotool (Wayland text injection)
- Set up uinput kernel module and permissions
- Configure GNOME keyboard shortcuts (if on GNOME Wayland)
- Generate systemd service files with GPU library paths
- Enable and start the daemon service
- Test GPU model loading (if GPU detected)
Installation Options
You can customize the installation with environment variables:
# Default: Standard install with GPU model testing (if GPU detected)
npm install -g swictation --foreground-scripts
# Skip model testing for CI/headless environments (faster install)
SKIP_MODEL_TEST=1 npm install -g swictation --foreground-scriptsInstallation behavior:
- Default: Model test-loading runs automatically when GPU is detected (~30s to verify)
- SKIP_MODEL_TEST=1: Skips model testing entirely (useful for CI/automated environments)
Note: Model testing is recommended to ensure your GPU setup works correctly. It verifies that the daemon can load models with your CUDA installation.
Quick Start
GNOME Wayland (Ubuntu 25.10+): Everything is auto-configured! Just start using it:
# Installation already configured everything, just start using:
swictation status # Check daemon is running
# Press Super+Shift+D to toggle recording (already configured!)Sway/Other Compositors: Add hotkey binding to your config (see Hotkey Configuration section), then:
swictation status # Check daemon is running
# Use your configured hotkey to toggle recordingManual Setup (if needed):
swictation setup # Configure systemd service
swictation start # Start the daemon
swictation toggle # Toggle recording via commandOptional UI:
swictation start --ui # Launch with visual interfaceCommands
swictation start [--ui]- Start the daemon (and optionally the UI)swictation stop- Stop the daemonswictation status- Show service statusswictation toggle- Toggle recording on/offswictation setup- Configure systemd and hotkeysswictation help- Show help message
Display Server Support
Fully Supported (Automated Setup)
- ✅ GNOME Wayland (Ubuntu 25.10+) - Keyboard shortcuts auto-configured
- ✅ Sway - Text injection auto-setup, manual hotkey configuration
- ✅ X11 - Traditional X11 support with xdotool
System Requirements
- OS: Linux x64
- Distribution: Ubuntu 24.04 LTS or newer (GLIBC 2.39+)
- ✅ Ubuntu 24.04 LTS (Noble Numbat)
- ✅ Ubuntu 25.10+ (Questing Quetzal) - Recommended for Wayland
- ✅ Debian 13+ (Trixie)
- ✅ Fedora 39+
- ❌ Ubuntu 22.04 LTS - NOT supported (GLIBC 2.35 too old)
- Node.js: 18.0.0 or higher
- Storage: ~12 GB total (AI models + GPU libraries)
- GPU: NVIDIA with 4GB+ VRAM (CUDA 12.x) or CPU-only mode
GPU Requirements (1.1B Model - Recommended)
For optimal performance with the 1.1B model (62.8x realtime speed):
- GPU: NVIDIA GPU with 4GB+ VRAM
- CUDA: 11.8+ or 12.x
- Compute Capability: 7.0+ (Turing architecture or newer)
- ONNX Runtime: onnxruntime-gpu 1.16.0+
GPU Acceleration
GPU acceleration libraries (CUDA + cuDNN) are automatically downloaded during npm installation. No manual CUDA installation required!
Runtime Dependencies
Automatically Installed by npm postinstall:
- ydotool (Wayland text injection) - Auto-installed on Wayland systems
- netcat (IPC socket communication) - Auto-installed
Pre-installed on most systems:
- GLIBC 2.39+ (Ubuntu 24.04+)
- libasound2 (ALSA sound library)
- systemd (service management)
- PipeWire or PulseAudio (audio capture)
For X11 systems (optional):
sudo apt install xdotool # Ubuntu/Debian
sudo pacman -S xdotool # Arch Linux
sudo dnf install xdotool # FedoraConfiguration
Configuration file is located at ~/.config/swictation/config.toml
Hotkey Configuration
GNOME Wayland: ✅ Automatically configured during installation!
- Default hotkey:
Super+Shift+D - Visible in: Settings → Keyboard → View and Customize Shortcuts
- No manual configuration needed
Sway - Add to ~/.config/sway/config:
# Swictation toggle
bindsym $mod+Shift+d exec swictation toggle
# Optional: Push-to-talk (hold to record)
bindsym $mod+Space exec swictation ptt-press
bindsym --release $mod+Space exec swictation ptt-releaseThen reload: swaymsg reload
X11 (i3, etc.) - Add to config:
bindsym Mod4+Shift+d exec swictation toggle→ See Complete Wayland Support Guide for detailed setup and troubleshooting.
Architecture
Runtime Components
Swictation consists of three main components:
- Daemon (
swictation-daemon) - Rust service handling audio capture and transcription - UI (
swictation-ui) - Tauri application for metrics and control - CLI (
swictation) - Node.js command-line interface
Package Architecture
Swictation uses a platform-specific package architecture similar to esbuild and swc:
swictation (main package)
├── @agidreams/linux-x64 (Linux x86_64 binaries)
└── @agidreams/darwin-arm64 (macOS Apple Silicon binaries)How it works:
Automatic Platform Detection: When you run
npm install -g swictation, npm automatically detects your platform (Linux x64 or macOS ARM64)Platform Package Installation: npm installs the main
swictationpackage PLUS the correct platform package for your system viaoptionalDependenciesBinary Resolution: The CLI wrapper automatically finds and uses the binaries from your platform package
Example installation flow on Linux:
npm install -g swictation
# npm automatically installs:
# - swictation (main package with CLI)
# - @agidreams/linux-x64 (platform binaries)What's in each package:
| Package | Contents |
|---------|----------|
| swictation | CLI wrapper, postinstall scripts, configuration |
| @agidreams/linux-x64 | Linux ELF binaries, ONNX Runtime base library |
| @agidreams/darwin-arm64 | macOS Mach-O ARM64 binaries, ONNX Runtime with CoreML |
Benefits:
- ✅ Smaller downloads - only your platform's binaries are installed
- ✅ Cleaner architecture - binaries grouped by platform
- ✅ Better CI/CD - each platform built independently
- ✅ Same simple install command -
npm install -g swictationjust works
Versioning Strategy
Swictation uses semantic versioning with a two-level approach:
Distribution Version (user-facing):
0.7.9- The version you see in
npm install [email protected] - Synchronized across all three packages (main + both platform packages)
- Single version number for the entire distribution
- The version you see in
Component Versions (internal):
- Daemon:
0.7.5(Rust binary version) - UI:
0.1.0(Tauri application version) - ONNX Runtime:
1.22.0(macOS),1.23.2(Linux GPU)
- Daemon:
Version Synchronization:
All package versions are automatically synchronized via npm-package/versions.json:
{
"distribution": "0.7.9",
"components": {
"daemon": { "version": "0.7.5" },
"ui": { "version": "0.1.0" }
}
}Why separate versions?
- Components evolve at different rates (UI changes rarely, daemon changes often)
- Users see ONE simple version number for the distribution
- Developers track component versions internally for debugging
- Platform packages use distribution version for npm publishing
Checking versions:
swictation --version
# Shows distribution version: 0.7.9
swictation-daemon --version
# Shows daemon component version: 0.7.5Verification
After installation, verify everything is working:
# Run comprehensive verification script
./node_modules/swictation/scripts/verify-installation.sh
# Or check manually
swictation status # Check daemon status
systemctl --user status swictation-daemon.service
journalctl --user -u swictation-daemon -n 50 # View logsExpected on Wayland:
- ✓ Wayland detected
- ✓ GNOME/Sway detected (as applicable)
- ✓ GPU libraries installed (if NVIDIA GPU present)
- ✓ ydotool installed
- ✓ uinput module loaded
- ✓ User in 'input' group
- ✓ GNOME keyboard shortcut configured (if GNOME)
- ✓ Service enabled and running
- ✓ IPC socket responding
Troubleshooting
Wayland-Specific Issues
Text not typing after transcription:
# 1. Verify ydotool permissions
sg input -c 'ydotool type "test"'
# 2. Check group membership (may need logout/login)
groups | grep input
# 3. Verify uinput permissions
ls -l /dev/uinput
# Should show: crw-rw---- 1 root input
# 4. Re-run setup if needed
./node_modules/swictation/scripts/setup-ydotool.shGNOME: Hotkey not working (types "D" instead):
# Verify shortcut configuration
gsettings get org.gnome.settings-daemon.plugins.media-keys custom-keybindings
# Re-configure if needed
./node_modules/swictation/scripts/setup-gnome-shortcuts.sh
# Check in Settings → Keyboard → View and Customize Shortcuts
# Look for "Swictation Toggle"Slow transcription (20-30 seconds instead of <1 second): This means CPU mode instead of GPU. Always start daemon via systemctl, not manually!
# Restart via systemctl (has correct LD_LIBRARY_PATH)
systemctl --user restart swictation-daemon.service
# Check logs for GPU detection
journalctl --user -u swictation-daemon.service | grep CUDA
# Expected: "Successfully registered `CUDAExecutionProvider`"
# Expected: "cuDNN version: 91501"Platform Package Issues
"Platform package @agidreams/linux-x64 not found"
Cause: npm's optionalDependencies failed to install the platform package.
Solution:
# Force reinstall with clean cache
npm install -g swictation --force
# Or manually install platform package
npm install -g @agidreams/linux-x64
npm install -g swictationVerify installation:
# Check that platform package is installed
npm list -g @agidreams/linux-x64 # Linux
npm list -g @agidreams/darwin-arm64 # macOS
# Check binary locations
swictation --version
# Should show platform package location"Unsupported platform: win32-x64"
Cause: Swictation currently supports Linux x64 and macOS ARM64 only.
Supported platforms:
- ✅ Linux x86_64 (Ubuntu 24.04+, Debian 13+, Fedora 39+)
- ✅ macOS Apple Silicon (ARM64)
- ❌ Windows (planned for future release)
- ❌ macOS Intel (planned for future release)
- ❌ Linux ARM64 (planned for future release)
"Platform package installed but binaries are missing"
Cause: Platform package build was incomplete or corrupted.
Solution:
# Reinstall platform package
npm uninstall -g @agidreams/linux-x64
npm install -g @agidreams/linux-x64 --force
# Verify binaries exist
ls -lh ~/.npm-global/lib/node_modules/@agidreams/linux-x64/bin/
# Should show: swictation-daemon, swictation-ui"Wrong architecture - ELF 64-bit instead of Mach-O"
Cause: Wrong platform package installed (Linux package on macOS or vice versa).
Solution:
# Remove wrong platform package
npm uninstall -g @agidreams/linux-x64
npm uninstall -g @agidreams/darwin-arm64
# Clean install with correct platform detection
npm install -g swictation --forceGeneral Installation Issues
"Cannot see postinstall output"
Solution: Use the --foreground-scripts flag:
sudo npm install -g swictation --foreground-scripts"GPU libraries download failed (404)"
Cause: GitHub release for this version doesn't exist yet. Solution: This only happens with unreleased versions. Published npm versions will work.
"libasound.so.2: cannot open shared object file"
Cause: Missing ALSA library. Solution:
# Ubuntu 24.04+
sudo apt-get install libasound2t64
# Ubuntu 22.04 and older
sudo apt-get install libasound2"Model test-loading failed during installation"
Cause: GPU not detected or CUDA libraries missing. Solution: This is non-fatal. Swictation will fall back to CPU mode. To verify GPU setup after installation:
# Check if CUDA is available
nvidia-smi
# Check CUDA libraries
ls /usr/local/cuda/lib64/Service won't start
# Check status
swictation status
# View logs
journalctl --user -u swictation-daemon -fNo audio input
# List audio devices
arecord -l
# Test microphone
arecord -d 5 test.wav && aplay test.wavText not being typed
- Wayland: See "Wayland-Specific Issues" section above
- X11: Ensure
xdotoolis installed - Check logs:
journalctl --user -u swictation-daemon -f
Audio device issues
Only 1 audio device detected (should see 4+): The systemd service should automatically import PulseAudio/PipeWire environment variables. Verify the service is running:
systemctl --user status swictation-daemon.serviceReal-time transcription not working (only transcribes at end)
Cause: VAD threshold too low - background noise exceeds threshold.
Solution: Adjust in ~/.config/swictation/config.toml:
vad_threshold = 0.25 # Default optimized for real-time
vad_min_silence = 0.8 # Seconds of silence before flushLower threshold (0.003) causes continuous speech detection. Optimal is 0.25 for balanced sensitivity.
After fixing environment variables
Always reload systemd and restart the daemon:
systemctl --user daemon-reload
systemctl --user restart swictation-daemon
# Verify it started correctly
systemctl --user status swictation-daemon
journalctl --user -u swictation-daemon -n 50Platform Support
| Platform | Status | Notes | |----------|--------|-------| | Linux Wayland (GNOME) | ✅ Fully Supported | Auto-configured hotkeys, ydotool setup | | Linux Wayland (Sway) | ✅ Fully Supported | Auto ydotool setup, manual hotkeys | | Linux X11 | ✅ Supported | Traditional X11 support | | NVIDIA GPU Acceleration | ✅ Supported | Auto-downloaded CUDA + cuDNN | | Linux AMD GPU | 🚧 Planned | ROCm support | | macOS (Apple Silicon/Intel) | 🚧 Planned | CoreML/Metal execution providers | | Windows + NVIDIA | 🚧 Planned | CUDA support | | Windows + AMD/Intel | 🚧 Planned | DirectML execution provider |
Development
Source code: https://github.com/yourusername/swictation
Building from source
# Clone repository
git clone https://github.com/yourusername/swictation
cd swictation
# Build Rust components
cd rust-crates
cargo build --release
# Build Tauri UI
cd ../tauri-ui
npm install
npm run buildLicense
MIT
Contributing
Contributions are welcome! Please read our Contributing Guide for details.
Additional Resources
- Wayland Support Guide - Complete Wayland setup and troubleshooting
- Secretary Mode Guide - Natural language commands reference
- Testing & Validation Guide - Display server testing procedures
Support
- Issues: https://github.com/robertelee78/swictation/issues
- Discussions: https://github.com/robertelee78/swictation/discussions
- Source Code: https://github.com/robertelee78/swictation
