audio-forms

v0.1.0

Fill forms with voice using Gemini Live API

Downloads

129

Readme

audio-forms

Fill forms with voice. Powered by Gemini Live API.

A React component that lets users fill out forms by speaking. Wrap your inputs in <AudioForm>, run the included server, and the fields are filled from the user's speech.


Features

  • Voice-to-form — Users speak naturally, fields fill automatically
  • Secure by design — API key stays on your server, never exposed to the browser
  • Zero config — Detects form fields from name attributes automatically
  • Double-check mode — Model confirms spelling of names, emails, and numbers before filling
  • Thinking levels — Adjustable reasoning depth for complex or ambiguous inputs
  • Works with any React app — No framework lock-in, just wrap your inputs

Quick Start

1. Install

npm install audio-forms

2. Start the server

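// server.ts
import { createAudioFormsServer } from 'audio-forms/server';

// Fail fast when the key is missing, since apiKey is required.
const apiKey = process.env.GEMINI_API_KEY;
if (!apiKey) throw new Error('GEMINI_API_KEY is not set');

createAudioFormsServer({
  apiKey,
  port: 3001,
});

Then run it:

npx tsx server.ts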

3. Add to your React app

import { AudioForm } from 'audio-forms';

function ContactForm() {
  return (
    <AudioForm serverUrl="ws://localhost:3001">
      <input name="fullName" placeholder="Full Name" />
      <input name="email" placeholder="Email" type="email" />
      <input name="phone" placeholder="Phone" type="tel" />
    </AudioForm>
  );
}

Click the mic button. Speak. Watch the form fill itself.


How It Works

Browser                    Your Server                 Gemini Live API
┌──────────┐    WebSocket    ┌──────────┐    WebSocket    ┌──────────┐
│ AudioForm│ ──────────────> │  Proxy   │ ──────────────> │  Gemini  │
│ (mic)    │ <────────────── │ (apiKey) │ <────────────── │  (voice) │
└──────────┘    audio/tool   └──────────┘    audio/tool   └──────────┘

  1. The user clicks the mic and the browser captures audio (16kHz PCM)
  2. Audio streams to your server via WebSocket
  3. The server proxies it to the Gemini Live API, adding the API key at this step
  4. The model extracts field values via function calling
  5. Fields update in real time, and the model speaks back confirmations
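
Step 1 mentions 16kHz PCM. The Web Audio API hands out samples as 32-bit floats in the range -1 to 1, so a capture pipeline usually converts them to integer PCM before streaming. The following is a minimal sketch of that conversion, not the package's actual code: the helper name `floatTo16BitPCM` is hypothetical, and the 16-bit sample depth is an assumption, since the README only states the sample rate.

```typescript
// Hypothetical helper: convert Web Audio float samples (range -1..1)
// to 16-bit signed PCM. The 16-bit depth is an assumption; the README
// only says "16kHz PCM".
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale by 32768 and positive values by 32767,
    // which together cover the full int16 range.
    out[i] = s < 0 ? Math.round(s * 32768) : Math.round(s * 32767);
  }
  return out;
}
```

Resampling to 16kHz is a separate step that this sketch does not cover.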

API

<AudioForm> Props

| Prop | Type | Default | Description |
|------|------|---------|-------------|
| serverUrl | string | required | WebSocket URL of your audio-forms server |
| model | string | "gemini-3.1-flash-live-preview" | Gemini model to use |
| doubleCheck | boolean | false | Spell back and confirm values before filling |
| thinkingLevel | "none" \| "low" \| "medium" \| "high" | "none" | Model reasoning depth |
| onFieldUpdate | (field, value) => void | - | Called when a field is filled |
| onStatusChange | (status) => void | - | Status: idle, connecting, listening, processing, error |
| onError | (error) => void | - | Called on errors |
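
For readers who prefer types to tables, the props above can be transcribed roughly as follows. This is a sketch built from the table, not the package's published typings: the names `AudioFormPropsSketch`, `FormStatus`, and `withDefaults` are illustrative, and the `string` and `unknown` parameter types on the callbacks are assumptions, because the table does not give them.

```typescript
// Union types taken from the values listed in the props table.
type ThinkingLevel = 'none' | 'low' | 'medium' | 'high';
type FormStatus = 'idle' | 'connecting' | 'listening' | 'processing' | 'error';

// Illustrative shape only; callback parameter types are assumptions.
interface AudioFormPropsSketch {
  serverUrl: string;
  model?: string;
  doubleCheck?: boolean;
  thinkingLevel?: ThinkingLevel;
  onFieldUpdate?: (field: string, value: string) => void;
  onStatusChange?: (status: FormStatus) => void;
  onError?: (error: unknown) => void;
}

// Resolve the documented defaults for any option the caller omitted.
function withDefaults(props: AudioFormPropsSketch) {
  return {
    serverUrl: props.serverUrl,
    model: props.model ?? 'gemini-3.1-flash-live-preview',
    doubleCheck: props.doubleCheck ?? false,
    thinkingLevel: props.thinkingLevel ?? ('none' as ThinkingLevel),
  };
}
```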

createAudioFormsServer(config)

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| apiKey | string | required | Your Gemini API key |
| port | number | 3001 | WebSocket server port |


Examples

Basic

<AudioForm serverUrl="ws://localhost:3001">
  <input name="name" placeholder="Name" />
  <input name="email" placeholder="Email" />
</AudioForm>

Accurate mode (recommended for names and emails)

<AudioForm serverUrl="ws://localhost:3001" doubleCheck={true}>
  <input name="name" placeholder="Name" />
  <input name="email" placeholder="Email" />
  <input name="phone" placeholder="Phone" />
</AudioForm>

With thinking for complex forms

<AudioForm
  serverUrl="ws://localhost:3001"
  doubleCheck={true}
  thinkingLevel="medium"
>
  <input name="name" placeholder="Full Name" />
  <input name="email" placeholder="Email" />
  <input name="phone" placeholder="Phone" />
  <input name="address" placeholder="Address" />
  <input name="dob" placeholder="Date of Birth" />
  <textarea name="notes" placeholder="Notes" />
</AudioForm>

With event handlers

<AudioForm
  serverUrl="ws://localhost:3001"
  onFieldUpdate={(field, value) => {
    console.log(`Filled ${field}: ${value}`);
  }}
  onStatusChange={(status) => {
    console.log('Status:', status);
  }}
  onError={(err) => {
    console.error('Audio form error:', err);
  }}
>
  <input name="name" placeholder="Name" />
</AudioForm>
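
One common use of onStatusChange is to drive a label next to the mic button. The sketch below maps each documented status to a display string. The function name `statusLabel` and the label wording are made up for illustration, and the sketch assumes the callback receives exactly the five status strings listed in the props table.

```typescript
// Status values as listed in the props table.
type MicStatus = 'idle' | 'connecting' | 'listening' | 'processing' | 'error';

// Hypothetical mapping from a status to user-facing text.
function statusLabel(status: MicStatus): string {
  switch (status) {
    case 'idle':
      return 'Click the mic to start';
    case 'connecting':
      return 'Connecting...';
    case 'listening':
      return 'Listening...';
    case 'processing':
      return 'Filling fields...';
    case 'error':
      return 'Something went wrong';
  }
}
```

In a component this could feed React state, for example by calling a state setter with `statusLabel(status)` inside the onStatusChange handler.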

Field Detection

The component automatically detects fields by scanning children for <input>, <textarea>, and <select> elements with a name attribute.

| Attribute | Used for |
|-----------|----------|
| name | Field identifier (sent to model) |
| placeholder | Human-readable label the model sees |
| aria-label | Fallback label if no placeholder |
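
The detection rules above can be expressed as a small pure function. This is a sketch of the described behavior rather than the component's real implementation: `describeFields` and both interfaces are hypothetical names, the input is a plain list of attribute records instead of React children, and falling back to the name when neither label attribute is present is an assumption.

```typescript
// Attributes the README says are read from each form element.
interface FieldAttributes {
  name?: string;
  placeholder?: string;
  'aria-label'?: string;
}

interface FieldDescriptor {
  name: string; // identifier sent to the model
  label: string; // human-readable label the model sees
}

// Keep only named elements; prefer placeholder, then aria-label.
// Using the name itself as a last resort is an assumption.
function describeFields(elements: FieldAttributes[]): FieldDescriptor[] {
  const fields: FieldDescriptor[] = [];
  for (const el of elements) {
    if (!el.name) continue; // elements without a name attribute are skipped
    fields.push({
      name: el.name,
      label: el.placeholder || el['aria-label'] || el.name,
    });
  }
  return fields;
}
```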


Security

Your Gemini API key never reaches the browser. The architecture:

  • Browser connects to your server via WebSocket
  • Your server connects to Gemini Live API with the API key
  • Messages are piped bidirectionally
  • The client has no knowledge of or access to the API key
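
The relay pattern in this list can be illustrated without any network code. The sketch below uses a deliberately tiny socket interface and an in-memory fake. `MessageSocket`, `FakeSocket`, `pipeSockets`, `startProxy`, and the `openUpstream` factory are all hypothetical names for illustration and say nothing about how the package's server is written.

```typescript
// Minimal socket shape for this sketch; real WebSocket libraries expose more.
interface MessageSocket {
  send(data: string): void;
  onMessage(handler: (data: string) => void): void;
}

// Relay every message in both directions without inspecting it.
function pipeSockets(client: MessageSocket, upstream: MessageSocket): void {
  client.onMessage((data) => upstream.send(data));
  upstream.onMessage((data) => client.send(data));
}

// In-memory stand-in used to exercise the relay without a network.
class FakeSocket implements MessageSocket {
  sent: string[] = [];
  private handlers: Array<(data: string) => void> = [];
  send(data: string): void {
    this.sent.push(data);
  }
  onMessage(handler: (data: string) => void): void {
    this.handlers.push(handler);
  }
  // Simulate a message arriving from the remote peer.
  receive(data: string): void {
    this.handlers.forEach((h) => h(data));
  }
}

// The key is used only when the server opens the upstream connection
// (openUpstream is a hypothetical factory), so it never enters relayed traffic.
function startProxy(
  client: MessageSocket,
  apiKey: string,
  openUpstream: (key: string) => MessageSocket,
): void {
  pipeSockets(client, openUpstream(apiKey));
}
```

Because the key is consumed by the upstream factory and never written to the client socket, nothing the browser receives contains it.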

Get Started with LLMs

Want to integrate audio-forms using an AI coding assistant? Copy the contents of llms.txt into your LLM's context. It contains a complete integration guide formatted for AI consumption.


Requirements

  • Node.js 18+
  • React 18+
  • A Gemini API key
  • Browser with microphone access (HTTPS in production)

Development

git clone https://github.com/voiceshop/audio-forms
cd audio-forms
npm install
npm run build        # Build the package
npm run example      # Run the demo (http://localhost:3000)

License

MIT