🎙️ AVA-Listener

ava-listener v0.1.2

AVA-Listener is an offline, transcription-driven wake phrase runtime designed for flexible custom AI assistants without requiring model retraining.


🚀 Quick Start

Get up and running with zero manual model installation or runtime setup. Built-in profiles are intended as starting templates. Copy and customize them rather than editing files inside node_modules.

Jump to: Getting Started with Profiles | Profile Schema | Debug Mode

Step 1: Install

npm install ava-listener
npx ava-listener setup

What setup does automatically:

  • Creates required runtime directories
  • Verifies package structure
  • Downloads required speech models
  • Validates model SHA256 hashes
  • Prepares the local runtime cache
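
To make the hash-validation step concrete, here is a minimal sketch of checking a downloaded file against an expected SHA-256 digest with Node's built-in crypto module. The helper names `sha256OfFile` and `verifyModel` are hypothetical and are not part of ava-listener; the package's own verification code may work differently.

```javascript
const crypto = require("node:crypto");
const fs = require("node:fs");

// Hypothetical helper: stream a file through SHA-256 and resolve with the hex digest.
// Streaming avoids loading a large model file into memory at once.
function sha256OfFile(filePath) {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash("sha256");
    fs.createReadStream(filePath)
      .on("error", reject)
      .on("data", (chunk) => hash.update(chunk))
      .on("end", () => resolve(hash.digest("hex")));
  });
}

// Hypothetical helper: compare the computed digest with the expected hex string.
// The comparison is case-insensitive because published digests vary in casing.
async function verifyModel(filePath, expectedHex) {
  const actual = await sha256OfFile(filePath);
  return actual === expectedHex.toLowerCase();
}
```

A setup routine along these lines would typically delete and redownload any file for which `verifyModel` resolves to `false`.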

Step 2: Copy a built-in profile

AVA-Listener ships with built-in starter profiles:

node_modules/
└── ava-listener/
    └── profiles/
        ├── arvsal.json
        ├── jarvis.json
        ├── base.json
        └── custom.json

Choose one and copy it into your own project:

Windows:

Copy-Item `
node_modules\ava-listener\profiles\arvsal.json `
.\arvsal.json

Linux/macOS:

cp node_modules/ava-listener/profiles/arvsal.json ./arvsal.json

You can now edit:

  • assistantName
  • wake phrases
  • variants
  • thresholds
  • cooldown values

Step 3: Start listening

const { AVAListener } = require("ava-listener");
const path = require("path");

async function run() {
    // Load the profile copied into the project in Step 2.
    const listener = new AVAListener({
        profile: path.join(__dirname, "arvsal.json"),
        debug: true
    });

    // Attach listeners before start() so startup events are captured.
    listener.on("bootstrap-start", () => console.log("[BOOTSTRAP]"));
    listener.on("runtime-ready", () => console.log("[RUNTIME READY]"));
    listener.on("wake", (e) => console.log("Wake:", e));
    listener.on("partial", (t) => console.log("[ASR]", t));
    listener.on("error", console.error);

    await listener.start();
}

run().catch(console.error);

debug:true enables live transcription logs and is recommended during initial wake phrase tuning.

Disable debug mode in production deployments.

Useful events exposed by AVA-Listener:

  • bootstrap-start
  • runtime-ready
  • wake
  • partial
  • error

📦 Model Storage

Downloaded models are cached locally and reused automatically.

Windows:

%LOCALAPPDATA%/AVAListener/models/

Linux/macOS:

~/.local/share/AVAListener/models/
  • Models download only once
  • Future startups reuse cache
  • Users may manually delete cache if they want forced redownloads
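
As an illustration of the locations listed above, the following sketch resolves the cache directory per platform. The function `modelCacheDir` is hypothetical, and the fallback used when `LOCALAPPDATA` is unset is an assumption; the package's own resolution logic may differ.

```javascript
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper mirroring the documented cache locations.
// Parameters are injectable so the logic can be exercised on any OS.
function modelCacheDir(platform = process.platform, env = process.env, home = os.homedir()) {
  if (platform === "win32") {
    // Assumption: fall back to <home>\AppData\Local when LOCALAPPDATA is not set.
    const base = env.LOCALAPPDATA || path.win32.join(home, "AppData", "Local");
    return path.win32.join(base, "AVAListener", "models");
  }
  // Linux and macOS share the same documented location.
  return path.posix.join(home, ".local", "share", "AVAListener", "models");
}

console.log(modelCacheDir());
```

Removing the printed directory (for example with `fs.rmSync(dir, { recursive: true, force: true })`) is one way to force the redownload mentioned in the last bullet.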

📖 Why it is called AVA-Listener

AVA stands for ARVSAL Voice Activation.

ARVSAL (Autonomous Response and Virtual System Layer) is the personal AI assistant system, created by me, that originally motivated this project. AVA-Listener began as the listening and wake-word layer of ARVSAL. Over time, it evolved into an independent, reusable runtime for custom voice activation, capable of powering any assistant without locking you into a single assistant name.


🤔 The Problem with Custom Wake Words

AVA-Listener was designed to address the limits of current wake-word tooling for local AI assistants.

  • Picovoice Porcupine: An excellent project, but it has increasingly shifted toward enterprise workflows and introduces access friction for individual developers.
  • OpenWakeWord: A strong open-source solution for many standard phrases, but custom uncommon words often require a training workflow or dataset creation.

Custom phrases such as:

  • "ARVSAL"
  • "Jarvis"
  • "Activate Protocol"
  • "Computer Prime"
  • "Project Athena"

can be difficult because ASR often transcribes them differently. This happens due to:

  • Pronunciation ambiguity
  • Uncommon phonetics
  • Accent variations
  • Transcription drift

🔤 Why "ARVSAL" is difficult

ARVSAL is not a common English word. Most ASR engines will have difficulty transcribing it accurately, especially across different speakers, accents, and environmental conditions.

A speech sample of someone saying "ARVSAL" might be transcribed as:

  • "arvsal"
  • "arsal"
  • "arsel"
  • "aircel"
  • "our whistle"

Why does this happen?

  • Uncommon phonetics: ARVSAL has no common English phoneme patterns.
  • Accent variations: Different speakers pronounce it differently.
  • Speech rate: Fast or slow speech changes how phonemes map to tokens.
  • Background noise: Microphone quality and ambient sound affect transcription.
  • ASR token ambiguity: The model may emit different token sequences for the same utterance.

Instead of forcing users to record training data and retrain an acoustic model, AVA-Listener embraces this challenge through variants.

You define ARVSAL once, then register the likely alternatives the ASR might produce. The runtime matches each transcription against all registered variants and fires a wake event when the match confidence reaches the phrase's threshold.

This approach is faster to configure, requires no data collection, and works immediately.
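
The variant idea can be sketched in a few lines. The README does not document the scoring function AVA-Listener uses, so the sketch below substitutes a normalized Levenshtein similarity purely for illustration; `normalize`, `levenshtein`, `similarity`, and `scorePhrase` are hypothetical names, not package APIs.

```javascript
// Normalize text before comparison: lowercase, drop punctuation, collapse whitespace.
function normalize(text) {
  return text.toLowerCase().replace(/[^a-z0-9\s]/g, " ").replace(/\s+/g, " ").trim();
}

// Levenshtein edit distance computed with a single rolling row.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // distance for prefixes (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for prefixes (i-1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Similarity in [0, 1]; 1 means the strings are identical.
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

// Score a transcript against the canonical phrase and every variant; keep the best score.
function scorePhrase(transcript, phraseDef) {
  const heard = normalize(transcript);
  const candidates = [phraseDef.phrase, ...(phraseDef.variants || [])].map(normalize);
  const score = Math.max(...candidates.map((c) => similarity(heard, c)));
  return { phraseId: phraseDef.phraseId, score, matched: score >= phraseDef.threshold };
}

const arvsal = {
  phraseId: "arvsal_core",
  phrase: "arvsal",
  variants: ["arsal", "our whistle"],
  threshold: 0.72
};
console.log(scorePhrase("Our whistle!", arvsal)); // exact variant hit
console.log(scorePhrase("good morning", arvsal)); // unrelated speech
```

Under this stand-in metric, a near miss such as "arsel" still scores about 0.8 against the registered variant "arsal", which shows why a handful of variants can cover a wider range of transcriptions than the list alone suggests.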


🧠 AVA Philosophy

AVA does not depend on training a neural model per wake word.

AVA-Listener uses streaming ASR as the foundation, then applies transcription matching and fuzzy phrase logic.

Pipeline: Speech → ASR → Transcription → Variants → Scoring → Confidence Filter → Event

This is the core design philosophy of the package. It means wake phrases are defined as text and variants, not as new acoustic models. That gives you fast iteration and flexible phrase control without dataset collection.


🕊️ Wake Phrase Freedom

AVA-Listener is built for free-form wake phrase design.

Supported phrase styles:

  • Single words: "jarvis", "computer", "echo"
  • Multiple words: "activate protocol", "hello assistant"
  • Complete sentences: "hello arvsal can you wake up"
  • Fictional names: "ultron", "hal"
  • Invented words: "arvsal", "snoodle"
  • Technical commands: "start diagnostic mode", "shutdown system"

No retraining. No dataset creation. No hundreds of recordings.


⚙️ Advanced Wake Phrase Logic

AVA-Listener combines multiple runtime controls:

  1. Phrase: the canonical target text.
  2. Variants: alternate ASR transcriptions.
  3. Threshold: per-phrase trigger sensitivity.
  4. EMA smoothing: reduces false spikes.
  5. Cooldown: prevents repeated triggers.
  6. Debug mode: helps you tune phrases quickly.
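
To show how threshold, EMA smoothing, and cooldown can interact, here is a minimal sketch. It assumes an asymmetric exponential moving average in which alpha is the weight of the newest raw score, with one alpha while the score rises and another while it decays (the profile options list `confidence.emaRiseAlpha` and `confidence.emaDecayAlpha`). The README does not spell out the actual formula, so treat `makeSmoother`, `makeCooldown`, and `makeDetector` as hypothetical.

```javascript
// Assumption: smooth += alpha * (raw - smooth), with a different alpha for rise and decay.
function makeSmoother({ riseAlpha = 0.7, decayAlpha = 0.3 } = {}) {
  let smooth = 0;
  return (raw) => {
    const alpha = raw > smooth ? riseAlpha : decayAlpha;
    smooth += alpha * (raw - smooth);
    return smooth;
  };
}

// Cooldown gate: permit a trigger only when cooldownMs has elapsed since the last one.
function makeCooldown(cooldownMs) {
  let lastTrigger = -Infinity;
  return (nowMs) => {
    if (nowMs - lastTrigger < cooldownMs) return false;
    lastTrigger = nowMs;
    return true;
  };
}

// Combine both: trigger when the smoothed score clears the threshold and the gate is open.
function makeDetector({ threshold, cooldownMs, riseAlpha, decayAlpha }) {
  const smoothScore = makeSmoother({ riseAlpha, decayAlpha });
  const gateOpen = makeCooldown(cooldownMs);
  return (rawScore, nowMs) => {
    const smooth = smoothScore(rawScore);
    // The gate is consulted only after the threshold check, so weak scores never consume it.
    return smooth >= threshold && gateOpen(nowMs);
  };
}

const detect = makeDetector({ threshold: 0.72, cooldownMs: 2000 });
console.log(detect(1.0, 0));    // one strong frame is smoothed to 0.7, below the threshold
console.log(detect(1.0, 100));  // a second strong frame lifts the smoothed score past it
console.log(detect(1.0, 200));  // still inside the cooldown window
```

With these example values a single high-scoring frame does not fire, which is the "reduces false spikes" behavior, and a sustained match fires once per cooldown window.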

🏗️ Architecture

AVA-Listener orchestrates several subsystems to detect custom wake phrases offline.

Speech Processing Pipeline

graph TD
    A[Microphone] --> B[Audio Buffer]
    B --> C[Silero VAD]
    C --> D[Streaming ASR]
    D --> E[Phrase Variants]
    E --> F[Confidence Filter]
    F --> G[Cooldown]
    G --> H[Wake Event]

Startup Flow

graph TD
    A[npm install] --> B[npx ava-listener setup]
    B --> C[Runtime validation]
    C --> D[Model verification]
    D --> E[Runtime startup]
    E --> F[Listening Ready]

🚀 Package Usage

The canonical workflow below uses the SDK as it ships with ava-listener. Attach listeners before calling start() so that lifecycle, bootstrap, download, runtime, and wake events are all captured.

const { AVAListener } = require("ava-listener");
const path = require("path");

async function run() {

    const listener = new AVAListener({
        profile: path.join(
            __dirname,
            "arvsal.json"
        ),
        debug: true
    });

    listener.on(
        "bootstrap-start",
        () => console.log("[BOOTSTRAP]")
    );

    listener.on(
        "download-progress",
        (x) => console.log(
            "[DOWNLOAD]",
            x
        )
    );

    listener.on(
        "runtime-ready",
        () => console.log("[RUNTIME READY]")
    );

    listener.on(
        "wake",
        (e) => console.log(
            "\nWAKE:",
            e
        )
    );

    listener.on(
        "partial",
        (t) => console.log(
            "[ASR]",
            t
        )
    );

    listener.on(
        "error",
        (e) => console.error(
            "[ERROR]",
            e
        )
    );

    await listener.start();
}

run().catch(console.error);

Event System

AVA-Listener emits runtime events during startup, model management, ASR streaming, wake detection, and runtime errors.

| Event | Description |
| :--- | :--- |
| bootstrap-start | Runtime startup sequence begins |
| download-progress | Model download/update progress |
| runtime-ready | Runtime initialized |
| wake | Wake phrase detected |
| partial | Live streaming ASR text |
| error | Runtime errors |

Creating Profiles

Profiles are JSON files that define the assistant name, profile version, and wake phrase registry.

{
  "assistantName":"AssistantName",
  "profileVersion":1,
  "wakePhrases":[]
}

Field reference:

  • assistantName: human-friendly assistant label shown in diagnostics and logging.
  • profileVersion: schema version for profile validation.
  • wakePhrases: array of phrase definitions.
  • phraseId: unique identifier for each wake phrase.
  • phrase: canonical target text for the wake phrase.
  • variants: alternate ASR transcriptions that should also trigger the same phrase.
  • threshold: per-phrase trigger sensitivity.
  • cooldownMs: minimum time in milliseconds before the same phrase may trigger again.
  • enabled: whether the phrase is active.

🧪 ARVSAL Example

This is the real ARVSAL profile from profiles/arvsal.json.

{
  "assistantName": "Arvsal",
  "profileVersion": 1,
  "wakePhrases": [
    {
      "phraseId": "arvsal_core",
      "phrase": "arvsal",
      "variants": [
        "arvsal",
        "arsal",
        "arzal",
        "arsel",
        "armsel",
        "arv sal",
        "ar sal",
        "our whistle",
        "or whistle",
        "ourvsel",
        "aircel",
        "ahsal",
        "arv"
      ],
      "threshold": 0.72,
      "cooldownMs": 2000,
      "enabled": true
    },
    {
      "phraseId": "hey_arvsal",
      "phrase": "hey arvsal",
      "variants": [
        "hey arvsal",
        "hey arsal",
        "hey arsel",
        "hey armsel",
        "hey arzal",
        "hey ar sal",
        "he arbezal",
        "hey our whistle",
        "hey or whistle",
        "wake up our whistle",
        "wake upon whistle"
      ],
      "threshold": 0.68,
      "cooldownMs": 2000,
      "enabled": true
    },
    {
      "phraseId": "wake_up_arvsal",
      "phrase": "wake up arvsal",
      "variants": [
        "wake up arvsal",
        "wake up arsal",
        "wake up arsel",
        "wake up our whistle",
        "wake upon whistle",
        "wreak up arvsal",
        "wreak up arsel",
        "wreak up our whistle"
      ],
      "threshold": 0.68,
      "cooldownMs": 2000,
      "enabled": true
    },
    {
      "phraseId": "listen_arvsal",
      "phrase": "listen arvsal",
      "variants": [
        "listen arvsal",
        "listen arsal",
        "listen arsel",
        "listen our whistle"
      ],
      "threshold": 0.72,
      "cooldownMs": 2000,
      "enabled": true
    },
    {
      "phraseId": "listen_buddy",
      "phrase": "listen buddy",
      "variants": [
        "list buddy",
        "listen bud",
        "listen bad",
        "listen badie"
      ],
      "threshold": 0.72,
      "cooldownMs": 2000,
      "enabled": true
    },
    {
      "phraseId": "listen",
      "phrase": "listen",
      "variants": [
        "listen",
        "his son",
        "son"
      ],
      "threshold": 0.72,
      "cooldownMs": 2000,
      "enabled": true
    }
  ],
  "extends": "base.json"
}

Why variants exist

ARVSAL is an uncommon name. ASR may interpret it as:

  • arvsal
  • arsal
  • arsel
  • our whistle
  • aircel

By registering these variants, the listener becomes robust to transcription drift.

🤖 Jarvis Example

This is the real Jarvis profile from profiles/jarvis.json.

{
  "assistantName": "Jarvis",
  "profileVersion": 1,
  "wakePhrases": [
    {
      "phraseId": "jarvis_core",
      "phrase": "jarvis",
      "variants": [
        "jarvis",
        "jarvas",
        "jarbes",
        "jarvus",
        "jarbus",
        "jar vis"
      ],
      "threshold": 0.72,
      "cooldownMs": 2000,
      "enabled": true
    },
    {
      "phraseId": "hey_jarvis",
      "phrase": "hey jarvis",
      "variants": [
        "hey jarvis",
        "hey jarvas",
        "hey jarbes",
        "hey jarvus",
        "hey jar vis"
      ],
      "threshold": 0.68,
      "cooldownMs": 2000,
      "enabled": true
    },
    {
      "phraseId": "ok_jarvis",
      "phrase": "ok jarvis",
      "variants": [
        "ok jarvis",
        "okay jarvis",
        "ok jarvas",
        "okay jar vis"
      ],
      "threshold": 0.68,
      "cooldownMs": 2000,
      "enabled": true
    }
  ],
  "extends": "base.json"
}

Multiple Wake Phrases

AVA supports many simultaneous phrases in the same profile. For example:

  • arvsal
  • hey arvsal
  • wake up arvsal
  • jarvis
  • computer
  • activate protocol
  • diagnostic mode

Each phrase can have its own sensitivity, cooldown, and variant set. That makes the runtime scalable across friendly names, natural commands, and custom assistant invocations.

🔧 Advanced Event Usage

The SDK is event-driven and supports runtime control with profile and phrase updates.

const { AVAListener } = require("ava-listener");
const path = require("path");

// await is not allowed at the top level of a CommonJS file, so wrap the calls in a function.
async function main() {
  const listener = new AVAListener({
    profile: path.join(__dirname, "jarvis.json"),
    debug: true
  });

  listener.on("bootstrap-start", () => console.log("Bootstrap started"));
  listener.on("runtime-ready", () => console.log("Runtime ready"));
  listener.on("wake", (event) => console.log("Wake detected", event));
  listener.on("partial", (text) => console.log("Partial transcript", text));
  listener.on("error", (err) => console.error("Runtime error", err));

  await listener.start();

  // Swap to a different profile while the runtime is active.
  await listener.loadProfile(path.join(__dirname, "arvsal.json"));

  // Register an extra phrase on top of the loaded profile.
  listener.addPhrase({
    phraseId: "activate_protocol",
    phrase: "activate protocol",
    variants: ["activate protocol", "activate pro to call"],
    threshold: 0.70,
    cooldownMs: 2000,
    enabled: true
  });

  // Hot-patch a runtime setting.
  listener.updateConfig({
    "confidence.defaultThreshold": 0.78
  });
}

main().catch(console.error);

📂 Getting Started with Profiles

AVA-Listener includes built-in profiles:

node_modules/ava-listener/profiles/

Examples:

profiles/
├── arvsal.json
├── jarvis.json
├── base.json
└── custom.json

Users can:

  • use existing profiles directly
  • copy existing profiles
  • modify them
  • create entirely new profiles

Recommended workflow:

Step 1: Copy

Linux/macOS:

cp node_modules/ava-listener/profiles/arvsal.json ./myassistant.json

Windows:

Copy-Item `
node_modules\ava-listener\profiles\arvsal.json `
.\myassistant.json

Step 2: Modify:

  • assistantName
  • phrases
  • variants
  • thresholds

Step 3: Pass profile path:

const listener=new AVAListener({
    profile:"./myassistant.json"
});

🧩 Profile Schema

{
  "assistantName":"Arvsal",
  "profileVersion":1,
  "wakePhrases":[
    {
      "phraseId":"arvsal_core",
      "phrase":"arvsal",
      "variants":[
        "arvsal"
      ],
      "threshold":0.72,
      "cooldownMs":2000,
      "enabled":true
    }
  ]
}

| Field | Type | Description |
| -------------- | -------- | ----------------------------- |
| assistantName | string | Assistant display name |
| profileVersion | integer | Profile format version |
| phraseId | string | Unique identifier |
| phrase | string | Main wake phrase |
| variants | string[] | Alternative ASR outputs |
| threshold | float | Match confidence threshold |
| cooldownMs | integer | Ignore period after detection |
| enabled | boolean | Enable/disable phrase |
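
The schema above can be checked with a few lines of plain JavaScript before a profile is handed to the runtime. This is a rough sketch only: `checkProfile` is a hypothetical helper, the assumption that thresholds lie between 0 and 1 is inferred from the example values, and `cooldownMs` and `enabled` are treated as optional because some addPhrase() examples omit them. The package's own validateProfile() may apply different rules.

```javascript
// Hypothetical structural check for a parsed profile object.
function checkProfile(profile) {
  const errors = [];
  if (typeof profile.assistantName !== "string") errors.push("assistantName must be a string");
  if (!Number.isInteger(profile.profileVersion)) errors.push("profileVersion must be an integer");
  if (!Array.isArray(profile.wakePhrases)) {
    errors.push("wakePhrases must be an array");
    return { valid: false, errors };
  }

  const seenIds = new Set();
  profile.wakePhrases.forEach((p, i) => {
    const at = `wakePhrases[${i}]`;
    if (typeof p.phraseId !== "string" || p.phraseId === "") {
      errors.push(`${at}.phraseId must be a non-empty string`);
    } else if (seenIds.has(p.phraseId)) {
      errors.push(`${at}.phraseId "${p.phraseId}" is duplicated`);
    } else {
      seenIds.add(p.phraseId);
    }
    if (typeof p.phrase !== "string") errors.push(`${at}.phrase must be a string`);
    if (!Array.isArray(p.variants) || !p.variants.every((v) => typeof v === "string")) {
      errors.push(`${at}.variants must be an array of strings`);
    }
    // Assumption: thresholds are similarity scores in the range 0..1.
    if (typeof p.threshold !== "number" || p.threshold < 0 || p.threshold > 1) {
      errors.push(`${at}.threshold must be a number between 0 and 1`);
    }
    if (p.cooldownMs !== undefined && (!Number.isInteger(p.cooldownMs) || p.cooldownMs < 0)) {
      errors.push(`${at}.cooldownMs must be a non-negative integer`);
    }
    if (p.enabled !== undefined && typeof p.enabled !== "boolean") {
      errors.push(`${at}.enabled must be a boolean`);
    }
  });
  return { valid: errors.length === 0, errors };
}

console.log(checkProfile({ assistantName: "Arvsal", profileVersion: 1, wakePhrases: [] }));
```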


✨ Creating Custom Profiles

Examples of custom profiles you can create:

  • Jarvis
  • ARVSAL
  • Computer
  • Athena
  • Activate Protocol

Users can create unlimited profiles.


🛠 Debug Mode

Debug mode prints live transcription information.

Enable:

const listener=new AVAListener({
    profile:"./arvsal.json",
    debug:true
});

When to use:

  • ✅ Creating new wake words
  • ✅ Improving accuracy
  • ✅ Investigating false negatives
  • ✅ Understanding ASR outputs
  • ✅ Tuning thresholds

When NOT to use:

  • ❌ Production deployment
  • ❌ Minimal logging environments


📈 How Debug Improves Accuracy

  1. Enable debug.
  2. Speak the phrase.
  3. Observe the transcription, for example: [ASR] our whistle
  4. Recognize the transcription drift.
  5. Add "our whistle" to the phrase's variants array.
  6. Retest.

This is the recommended workflow for tuning uncommon words like: ARVSAL, Jarvis, Athena, Ultron, Computer Prime, etc.
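
The "add the observed text to variants" step can be scripted. The sketch below is a hypothetical helper, `addVariant`, that works on a parsed profile object and returns an updated copy; reading and writing the JSON file is left to the caller. Lowercasing the observed text is an assumption based on the lowercase variants in the shipped profiles.

```javascript
// Hypothetical helper: return a copy of the profile with `heard` added to one phrase's variants.
function addVariant(profile, phraseId, heard) {
  const variant = heard.trim().toLowerCase(); // assumption: variants are stored lowercase
  let found = false;
  const wakePhrases = profile.wakePhrases.map((p) => {
    if (p.phraseId !== phraseId) return p;
    found = true;
    // Skip duplicates so repeated tuning runs do not grow the list.
    const variants = p.variants.includes(variant) ? p.variants : [...p.variants, variant];
    return { ...p, variants };
  });
  if (!found) throw new Error(`Unknown phraseId: ${phraseId}`);
  return { ...profile, wakePhrases };
}

const profile = {
  assistantName: "Arvsal",
  profileVersion: 1,
  wakePhrases: [{ phraseId: "arvsal_core", phrase: "arvsal", variants: ["arvsal"], threshold: 0.72 }]
};
const tuned = addVariant(profile, "arvsal_core", "our whistle");
console.log(tuned.wakePhrases[0].variants);
```

To persist the change, write the result back with `fs.writeFileSync(profilePath, JSON.stringify(tuned, null, 2))`. For a running listener, the documented updateVariants(phraseId, variants) method replaces a phrase's variant list without editing the file.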


🎯 Best Practices

  • Start with a low phrase count
  • Enable debug during setup
  • Add common ASR mistakes to variants
  • Tune threshold slowly
  • Avoid extremely short one-syllable words
  • Use cooldowns to prevent retriggers
  • Disable debug in production

💡 Understanding Wake Profiles

A wake profile is not a separate acoustic model for each phrase. Instead, AVA-Listener uses ASR output and variant matching so that:

  • speech is transcribed by Sherpa-ONNX,
  • text is compared against the canonical phrase,
  • alternate transcriptions are accepted via variants,
  • scores are filtered by threshold,
  • wake events are emitted when confidence is high.

This design avoids the need to train a different model for every new wake phrase.


🛠️ All User Controls

AVA-Listener exposes rich configuration at the SDK and profile levels.

SDK Initialization Options

Passed to new AVAListener(options):

| Name | Type | Default | Description | Example |
| :--- | :--- | :--- | :--- | :--- |
| debug | Boolean | false | Enable SDK debug logging. | true |
| profile | String | null | Path of a JSON profile to load at startup. | "./profiles/jarvis.json" |
| startPaused | Boolean | false | Start in READY but do not activate detection. | true |

Profile Options

Defined in JSON and loaded via listener.loadProfile(path).

When a child profile extends a parent, deep objects are merged automatically, but the wakePhrases array is replaced entirely by the child profile.

| Name | Type | Default | Description | Example |
| :--- | :--- | :--- | :--- | :--- |
| extends | String | null | Parent profile path for inheritance. | "base.json" |
| assistantName | String | "Jarvis" | Human-friendly assistant label. | "ARVSAL" |
| vad.sileroThreshold | Float | 0.15 | VAD confidence threshold for Silero. | 0.20 |
| vad.aggressiveness | Integer | 1 | VAD aggressiveness level. | 2 |
| asr.numThreads | Integer | 2 | Sherpa-ONNX thread count. | 4 |
| confidence.defaultThreshold | Float | 0.78 | Fallback phrase similarity threshold. | 0.80 |
| confidence.emaRiseAlpha | Float | 0.70 | EMA rise smoothing factor. | 0.85 |
| confidence.emaDecayAlpha | Float | 0.30 | EMA decay smoothing factor. | 0.15 |
| confidence.cooldownSeconds | Float | 2.0 | Global cooldown after a trigger. | 3.0 |
| transcription.enableDebug | Boolean | false | Emit live transcription diagnostics. | true |
| diagnostics.enableInternalTrace | Boolean | false | Enable deep runtime tracing. | false |
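
The inheritance rule stated above (nested objects merge, while wakePhrases is replaced by the child) can be sketched as a recursive merge in which arrays and scalars from the child always win. `mergeProfiles` is a hypothetical illustration, and the nested object shape used in the example is an assumption; the package's loader may differ in detail.

```javascript
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Hypothetical merge: plain objects are merged key by key; everything else
// (arrays such as wakePhrases, numbers, strings, booleans) replaces the parent value.
function mergeProfiles(parent, child) {
  const merged = { ...parent };
  for (const [key, value] of Object.entries(child)) {
    merged[key] = isPlainObject(value) && isPlainObject(merged[key])
      ? mergeProfiles(merged[key], value)
      : value;
  }
  return merged;
}

const base = {
  confidence: { defaultThreshold: 0.78, cooldownSeconds: 2.0 },
  wakePhrases: [{ phraseId: "base_phrase" }]
};
const child = {
  assistantName: "Project Athena",
  confidence: { defaultThreshold: 0.8 },
  wakePhrases: [{ phraseId: "athena_core" }]
};
console.log(JSON.stringify(mergeProfiles(base, child), null, 2));
```

In the merged result, `confidence.cooldownSeconds` survives from the parent while the parent's phrase list is dropped, so a child profile that wants a parent's phrases has to repeat them.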

Runtime Hot-Reload Controls

The runtime supports hot updates for these fields while active:

  • vad.sileroThreshold
  • vad.aggressiveness
  • confidence.defaultThreshold
  • confidence.emaRiseAlpha
  • confidence.emaDecayAlpha
  • confidence.cooldownSeconds
  • transcription.enableDebug

Other fields such as asr.modelPath, audio.sampleRate, and thread settings require a restart.
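
Before calling updateConfig(), it can help to separate the keys that are documented as hot-reloadable from those that need a restart. The sketch below hard-codes the list above; `splitPatch` is a hypothetical helper, and how the runtime itself treats unsupported keys is not documented here.

```javascript
// Keys documented as hot-reloadable while the runtime is active.
const HOT_RELOADABLE = new Set([
  "vad.sileroThreshold",
  "vad.aggressiveness",
  "confidence.defaultThreshold",
  "confidence.emaRiseAlpha",
  "confidence.emaDecayAlpha",
  "confidence.cooldownSeconds",
  "transcription.enableDebug"
]);

// Hypothetical helper: split a dotted-key patch into hot and restart-required parts.
function splitPatch(patch) {
  const hot = {};
  const needsRestart = {};
  for (const [key, value] of Object.entries(patch)) {
    (HOT_RELOADABLE.has(key) ? hot : needsRestart)[key] = value;
  }
  return { hot, needsRestart };
}

const { hot, needsRestart } = splitPatch({
  "confidence.defaultThreshold": 0.82,
  "asr.numThreads": 4
});
console.log(hot);          // safe to pass to listener.updateConfig(hot)
console.log(needsRestart); // apply via the profile, then restart the listener
```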

Phrase Controls

Used with listener.addPhrase() or profile JSON.

| Name | Type | Description | Example |
| :--- | :--- | :--- | :--- |
| phraseId | String | Unique identifier for the phrase. | "jarvis_core" |
| phrase | String | Canonical wake phrase text. | "jarvis" |
| variants | Array | Alternate ASR transcriptions. | ["jarvas", "jarbes"] |
| threshold | Float | Per-phrase trigger sensitivity. | 0.72 |
| cooldownMs | Integer | Phrase-specific cooldown in ms. | 2000 |
| weight | Float | Relative scoring priority. | 1.5 |
| enabled | Boolean | Enable or mute a phrase. | true |


🧩 Public API

AVA-Listener exposes these runtime controls through new AVAListener().

Lifecycle

  • start(profilePath?, opts?): boot runtime, verify models, launch Python supervisor, and connect transport.
  • pause(): pause detection while keeping the runtime alive.
  • resume(): resume detection from READY or PAUSED.
  • stop(): gracefully shut down the runtime and supervisor.
  • restart(): stop and start again using the current profile.
  • destroy(): release resources and remove listeners.
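
One plausible use of pause() and resume() is to suspend detection while the application does something that should not retrigger a wake, such as playing a spoken reply. The wrapper below is a hypothetical sketch; it awaits both calls, which works whether or not they return promises (the README does not say).

```javascript
// Hypothetical helper: pause wake detection while `task` runs, then always resume.
// `listener` is expected to be an AVAListener instance (or anything with pause/resume).
async function withDetectionPaused(listener, task) {
  await listener.pause();
  try {
    return await task();
  } finally {
    // Resume even if the task throws, so the listener is never left paused by accident.
    await listener.resume();
  }
}
```

For example, `await withDetectionPaused(listener, () => playReply(text))`, where `playReply` stands in for the application's own audio output function.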

Configuration & Profiles

  • loadProfile(profilePath): load or reload a JSON profile at runtime.
  • validateProfile(profilePath): validate a profile file and return {valid, errors, warnings}.
  • updateConfig(patch): hot-patch supported runtime settings.
  • getEffectiveConfig(): fetch the current merged profile/config values.
  • updateRuntimeParameters(params): alias for updateConfig().
  • resetParameters(): reset runtime-updatable values.
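
A common pattern with these methods is to validate a profile before loading it. The sketch below assumes only what is listed above: validateProfile() yields an object with valid, errors, and warnings, and loadProfile() takes a path. Whether either call is synchronous is not stated, so both are awaited; `loadProfileSafely` is a hypothetical name.

```javascript
// Hypothetical helper: validate first, surface warnings, and load only when valid.
async function loadProfileSafely(listener, profilePath) {
  const result = await listener.validateProfile(profilePath);
  for (const warning of result.warnings || []) {
    console.warn("[PROFILE WARNING]", warning);
  }
  if (!result.valid) {
    // The shape of individual error entries is not documented, so serialize them as-is.
    throw new Error(`Invalid profile ${profilePath}: ${JSON.stringify(result.errors || [])}`);
  }
  await listener.loadProfile(profilePath);
  return result;
}
```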

Phrase Management

  • addPhrase(phraseObj): add a phrase to the active registry.
  • removePhrase(phraseId): remove a phrase by ID.
  • enablePhrase(phraseId): enable an existing phrase.
  • disablePhrase(phraseId): disable an existing phrase.
  • updateVariants(phraseId, variants): replace a phrase's variant list.
  • getPhrases(): request the active phrase registry.

Diagnostics

  • getState(): returns the current state machine state.
  • getHealth(): returns runtime health data.
  • getMetrics(): returns metrics from the runtime.
  • getDiagnostics(): returns diagnostic state information.
  • getManifest(): returns the runtime handshake manifest.
  • getCapabilities(): returns runtime capability flags.
  • enableExperimentMode(): enable experiment mode if supported.

Events

  • statechange: emitted for every state transition.
  • ready: emitted when the runtime reaches READY.
  • running: emitted when detection becomes active.
  • paused: emitted when detection is paused.
  • stopped: emitted when the runtime stops.
  • failed: emitted when startup or runtime failure occurs.
  • recovering / reconnected: emitted during reconnect recovery.
  • wake: emitted when ASR matching fires a wake event.

🖼️ Example Gallery

Basic Usage

const { AVAListener } = require('ava-listener');

async function run() {
  const listener = new AVAListener();

  listener.on('wake', (event) => {
    console.log(`Wake detected: ${event.phrase} raw=${event.raw_confidence} smooth=${event.smooth_confidence}`);
  });

  await listener.start();
}
run().catch(console.error);

Multiple Wake Phrases

listener.addPhrase({
  phraseId: 'hey_computer',
  phrase: 'hey computer',
  variants: ['hey computer', 'a computer'],
  threshold: 0.70
});

listener.addPhrase({
  phraseId: 'cancel_action',
  phrase: 'cancel',
  variants: ['cancel', 'stop', 'abort'],
  threshold: 0.85
});

ARVSAL Profile

{
  "assistantName": "Arvsal",
  "profileVersion": 1,
  "wakePhrases": [
    {
      "phraseId": "arvsal_core",
      "phrase": "arvsal",
      "variants": [
        "arvsal",
        "arsal",
        "arzal",
        "arsel",
        "armsel",
        "arv sal",
        "ar sal",
        "our whistle",
        "or whistle",
        "ourvsel",
        "aircel",
        "ahsal",
        "arv"
      ],
      "threshold": 0.72,
      "cooldownMs": 2000,
      "enabled": true
    },
    {
      "phraseId": "hey_arvsal",
      "phrase": "hey arvsal",
      "variants": [
        "hey arvsal",
        "hey arsal",
        "hey arsel",
        "hey armsel",
        "hey arzal",
        "hey ar sal",
        "he arbezal",
        "hey our whistle",
        "hey or whistle",
        "wake up our whistle",
        "wake upon whistle"
      ],
      "threshold": 0.68,
      "cooldownMs": 2000,
      "enabled": true
    },
    {
      "phraseId": "wake_up_arvsal",
      "phrase": "wake up arvsal",
      "variants": [
        "wake up arvsal",
        "wake up arsal",
        "wake up arsel",
        "wake up our whistle",
        "wake upon whistle",
        "wreak up arvsal",
        "wreak up arsel",
        "wreak up our whistle"
      ],
      "threshold": 0.68,
      "cooldownMs": 2000,
      "enabled": true
    }
  ],
  "extends": "base.json"
}

Jarvis Profile

{
  "assistantName": "Jarvis",
  "profileVersion": 1,
  "wakePhrases": [
    {
      "phraseId": "jarvis_core",
      "phrase": "jarvis",
      "variants": [
        "jarvis",
        "jarvas",
        "jarbes",
        "jarvus",
        "jarbus",
        "jar vis"
      ],
      "threshold": 0.72,
      "cooldownMs": 2000,
      "enabled": true
    },
    {
      "phraseId": "hey_jarvis",
      "phrase": "hey jarvis",
      "variants": [
        "hey jarvis",
        "hey jarvas",
        "hey jarbes",
        "hey jarvus",
        "hey jar vis"
      ],
      "threshold": 0.68,
      "cooldownMs": 2000,
      "enabled": true
    },
    {
      "phraseId": "ok_jarvis",
      "phrase": "ok jarvis",
      "variants": [
        "ok jarvis",
        "okay jarvis",
        "ok jarvas",
        "okay jar vis"
      ],
      "threshold": 0.68,
      "cooldownMs": 2000,
      "enabled": true
    }
  ],
  "extends": "base.json"
}

Debug Mode

listener.updateConfig({
  'transcription.enableDebug': true
});

Threshold Tuning

listener.updateConfig({
  'confidence.defaultThreshold': 0.82
});

Event Listeners

listener.on('statechange', ({ from, to }) => console.log(`State: ${from} -> ${to}`));
listener.on('ready', () => console.log('READY'));
listener.on('wake', (e) => console.log(`WAKE ${e.phrase}`));

Advanced Configuration

Use profile inheritance to preserve shared defaults while swapping wake phrases:

{
  "extends": "base.json",
  "assistantName": "Project Athena",
  "wakePhrases": [
    {
      "phraseId": "athena_core",
      "phrase": "project athena",
      "variants": ["project athena", "project athen a"],
      "threshold": 0.70,
      "cooldownMs": 2500,
      "enabled": true
    }
  ]
}

📊 Production Baseline

The following values come directly from:

benchmarks/baseline.md

Startup Metrics

| Metric | Value |
| --------------------- | --------- |
| Warm Start | 3770 ms |
| Cold Start | 19138 ms |
| Worker Spawn | 567.3 ms |
| Worker Ready | 2652.3 ms |
| Startup Success | 100% |
| Worker Failures | 0 |
| Websocket Disconnects | 0 |

Optimization Evidence

| Operation | Before | After |
| ------------------ | ---------- | --------- |
| Process Scan | 2981.75 ms | 0.014 ms |
| Model Verification | 2792.29 ms | 11.234 ms |

Startup Improvement

5762.8 ms improvement
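
The improvement figure appears to be consistent with adding up the per-operation savings in the Optimization Evidence table. That reading is an inference from the numbers rather than something the benchmark file is quoted as saying; the short check below reproduces the arithmetic.

```javascript
// Before/after timings in milliseconds, copied from the Optimization Evidence table.
const before = { processScan: 2981.75, modelVerification: 2792.29 };
const after = { processScan: 0.014, modelVerification: 11.234 };

// Sum the time saved per operation: (2981.75 - 0.014) + (2792.29 - 11.234).
const savedMs = Object.keys(before).reduce(
  (sum, key) => sum + (before[key] - after[key]),
  0
);

console.log(`${savedMs.toFixed(1)} ms`); // prints "5762.8 ms"
```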

AVA-Listener aggressively optimizes startup behavior by:

  • avoiding repeated process scans
  • caching verified runtime state
  • reducing model validation overhead
  • minimizing worker initialization delays

⚡ Wake Detection Latency

AVA-Listener is designed for low-latency wake detection while remaining fully local and model-flexible.

Measured AVA Runtime

| Metric | Value |
| ---------------------------------- | ------- |
| Approximate wake detection latency | ~250 ms |

These measurements come from runtime benchmarking and startup validation results in the repository.

The latency represents the approximate time between finishing a wake phrase and the event being emitted to the application layer.


Latency Context

| System | Latency |
| ------------------- | ------------------------ |
| AVA-Listener | ~250 ms |
| Picovoice Porcupine | Hardware dependent |
| OpenWakeWord | Hardware/model dependent |

Porcupine and OpenWakeWord do not expose a single universal latency number because runtime performance varies substantially with:

  • CPU hardware
  • model selection
  • frame sizes
  • audio pipelines
  • runtime configuration

AVA-Listener prioritizes:

  • low latency
  • zero manual model generation
  • custom phrase flexibility
  • offline execution

Note: Unlike traditional wake-word engines that require retraining or generated models for uncommon words, AVA-Listener preserves low latency while allowing arbitrary phrase definitions through configurable variants.


🚀 Installation & Usage

NPM Package

npm install ava-listener
npx ava-listener setup

Clone & Build

git clone https://github.com/atharvpatil2748/ava-listener.git
cd ava-listener
npm install
npm run setup-models
npm run verify
npm start

🩺 Troubleshooting

Models not downloading: Run npx ava-listener setup or npm run setup-models. AVA-Listener auto-generates models/manifests/manifest.json if it is missing.

Microphone unavailable: Grant microphone permission to your terminal/Node process and confirm a valid input device is present.

False positives: Increase threshold or confidence.defaultThreshold. Add more variants for mis-transcribed versions.

False negatives: Enable transcription.enableDebug and watch the transcriptions. Add the observed output to variants.

Slow startup: Use Node >= 18 and Python >= 3.10. Verify model downloads completed successfully.


❓ FAQ

Why not train a custom model?
Training an acoustic model requires data, infrastructure, and tuning. AVA-Listener achieves custom wake detection through transcription matching, which is faster to configure and avoids dataset collection.

Can I use multiple phrases?
Yes. You can register many phrases with their own thresholds, variants, and cooldowns.

Can I use non-English phrases?
The shipped Sherpa-ONNX model is English-focused, but you can still add non-English transcriptions as variants if the ASR produces them consistently.

Can I run fully offline?
Yes. The runtime itself is offline. Internet is required only for initial model setup (npx ava-listener setup).

Can I create my own profile?
Yes. Create a JSON profile in profiles/ and load it with listener.loadProfile(path).


🤝 Community

AVA-Listener started as the listening and wake-word layer of ARVSAL, a personal AI assistant system. Over time, it has evolved into an independent, reusable package for custom voice activation.

We welcome contributions from the community, from research to production improvements.

Areas for Contribution

Runtime & Performance

  • Optimizing startup latency and memory footprint
  • Improving audio buffering and VAD algorithms
  • Threading and concurrency enhancements
  • Cross-platform testing (Linux, macOS, Windows ARM)

Matching & Detection

  • Better phrase matching algorithms
  • Confidence scoring improvements
  • Multi-language support and non-English ASR models
  • Advanced noise robustness

Profiles & Examples

  • Community-contributed profiles for popular assistants
  • Example integrations with smart home platforms
  • Benchmarking across different hardware and microphones
  • Accent and multilingual profile variants

Documentation & Testing

  • Architecture documentation and design decisions
  • Research experiments and academic papers
  • Tutorial videos and integration guides
  • Comprehensive test coverage

If you're interested in contributing, please open an issue or pull request. All contributions are appreciated.


💻 Development

To develop or modify the engine, use the built-in NPM scripts:

  • npm run setup-models: Downloads the models.
  • npm run verify: Performs a layout structure and manifest validity check.
  • npm start: Runs examples/manual_sdk_test.js to immediately test microphone detection.
  • npm test: Executes the test runner.

🗺️ Roadmap

  • [x] Isolate runtime from hardcoded logic.
  • [x] Implement dynamic JSON profile system.
  • [x] Release Node.js NPM wrapper.
  • [ ] Implement Rust-based audio capture backend to replace PyAudio dependencies.
  • [ ] Formalize Plugin API for overriding the phrase matcher.
  • [ ] WebUI configuration dashboard for tuning thresholds in real-time.

📄 License

Licensed under the MIT License. See LICENSE for details.


🙏 Acknowledgements

AVA-Listener stands on the shoulders of giants in the open-source speech community:

  • Sherpa-ONNX: Provides the incredibly fast streaming ASR backbone.
  • Silero VAD: Highly accurate, lightweight voice activity detection.
  • Picovoice & OpenWakeWord: For inspiring the deep need for accessible, local voice activation infrastructure.