@quak.lib/voice-assistant-js
v1.0.0
Local voice assistant pipeline: mic → whisper → ollama → tts. Your own Jarvis, zero cloud.
Your own local Jarvis. Zero cloud. Zero API keys. Just your voice.
🎤 you speak → 🧠 AI thinks → 🔊 it responds
Built on Whisper + Ollama + Piper TTS. Runs entirely on your machine.
Quick start
npm install @quak.lib/voice-assistant-js

const { VoiceAssistant } = require("@quak.lib/voice-assistant-js");
const assistant = new VoiceAssistant();
assistant.start();

🎤 Listening...
You: what's the capital of France
AI: Paris is the capital of France.
🔊 ...
---
🎤 Listening...

Prerequisites
You need three things installed locally:
1. Ollama (the brain)
# Install from https://ollama.com
ollama pull llama3

2. Piper TTS (the voice)
Download the binary for your platform from the Piper releases page, then add it to your PATH.
Download a voice model (.onnx file) from Hugging Face and place it in ~/.local/share/piper-voices/.
# Example: download the default voice
mkdir -p ~/.local/share/piper-voices
wget https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx \
  -O ~/.local/share/piper-voices/en_US-lessac-medium.onnx

3. faster-whisper (the ears)
pip install faster-whisper

Verify everything works
npx @quak.lib/voice-assistant-js check

API
new VoiceAssistant(options?)
| Option | Type | Default | Description |
|---|---|---|---|
| model | string | "llama3" | Ollama model to use |
| voice | string | "en_US-lessac-medium" | Piper voice model name |
| listenDuration | number | 4000 | Microphone recording time (ms) |
| systemPrompt | string | (built-in) | System prompt for the LLM |
| verbose | boolean | false | Enable debug logging |
Methods
assistant.start() — Start a continuous listen → think → speak loop. Runs until assistant.stop() or Ctrl+C.
assistant.once() — Run exactly one cycle. Returns a Promise resolving to { input, response }.
assistant.stop() — Stop the continuous loop.
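As a sketch of how these methods compose, here is a driver built only on the documented once() method (chatUntilGoodbye and the stop phrase are illustrative, not part of the package):

```javascript
// Illustrative driver: run one listen → think → speak cycle at a time
// and stop once the transcript contains "goodbye".
async function chatUntilGoodbye(assistant) {
  const log = [];
  for (;;) {
    const { input, response } = await assistant.once();
    log.push({ input, response });
    if (/\bgoodbye\b/i.test(input)) break;
  }
  return log;
}

// In real use: chatUntilGoodbye(new VoiceAssistant());
```

This gives finer control than start(), e.g. logging every exchange before deciding whether to continue.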
CLI
# Start the assistant
npx @quak.lib/voice-assistant-js
# One-shot mode
npx @quak.lib/voice-assistant-js --once
# Custom model and voice
npx @quak.lib/voice-assistant-js --model mistral --voice en_US-ryan-high
# Longer listening window
npx @quak.lib/voice-assistant-js --duration 6000
# Check dependencies
npx @quak.lib/voice-assistant-js check

Examples
Basic loop
const { VoiceAssistant } = require("@quak.lib/voice-assistant-js");
const assistant = new VoiceAssistant();
assistant.start();

Custom model and persona
const { VoiceAssistant } = require("@quak.lib/voice-assistant-js");
const assistant = new VoiceAssistant({
model: "mistral",
voice: "en_US-ryan-high",
listenDuration: 6000,
systemPrompt: "You are a sarcastic assistant. Keep answers under two sentences.",
});
assistant.start();

One-shot (great for scripts)
const { VoiceAssistant } = require("@quak.lib/voice-assistant-js");
const assistant = new VoiceAssistant({ model: "llama3" });

// Top-level await is not available in CommonJS, so wrap the call:
(async () => {
  const { input, response } = await assistant.once();
  console.log(`You said: "${input}"`);
  console.log(`AI said: "${response}"`);
})();

Pipeline
Microphone
│
▼
listen.js — records audio via mic npm package
│
▼
scripts/whisper.py — transcribes with faster-whisper (Python)
│
▼
src/llm.js — sends text to Ollama, gets response
│
▼
src/tts.js — converts response to speech via Piper
│
▼
Speaker

Environment variables
| Variable | Default | Description |
|---|---|---|
| OLLAMA_URL | http://localhost:11434 | Ollama server URL |
| PIPER_MODELS_DIR | ~/.local/share/piper-voices | Directory with .onnx voice files |
| WHISPER_MODEL | base | Whisper model size (tiny, base, small, medium, large) |
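The defaults in the table above can be resolved with a small helper (resolveConfig is a hypothetical name for illustration; the package's internal config code may differ):

```javascript
// Resolve the documented environment variables, falling back to the
// defaults from the table above when a variable is unset.
function resolveConfig(env = process.env) {
  const home = env.HOME || env.USERPROFILE || "~";
  return {
    ollamaUrl: env.OLLAMA_URL || "http://localhost:11434",
    piperModelsDir: env.PIPER_MODELS_DIR || `${home}/.local/share/piper-voices`,
    whisperModel: env.WHISPER_MODEL || "base",
  };
}

// e.g. WHISPER_MODEL=small node app.js selects the "small" Whisper model.
```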
Roadmap
- [ ] Voice Activity Detection (VAD) — stop recording on silence instead of fixed timer
- [ ] Streaming responses — speak while the LLM is still generating
- [ ] Wake word support ("hey computer") via Porcupine
- [ ] Tool use — let the assistant run shell commands, open URLs, search files
- [ ] macOS / Windows installer scripts
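For the VAD item above, a common baseline is an energy threshold over audio frames; a minimal sketch (isSilence and the threshold value are illustrative, not part of the package):

```javascript
// Energy-based silence check, a common VAD baseline: compute the RMS of
// a frame of normalized PCM samples (-1..1) and compare it to a threshold.
function isSilence(samples, threshold = 0.01) {
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  const rms = Math.sqrt(sumSquares / samples.length);
  return rms < threshold;
}

// Recording could stop once isSilence() holds for, say, ~500 ms of frames.
```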
License
MIT © quak.lib
