@myscheme/voice-form-filling
v0.1.2
Voice-driven form filling demo using Azure Speech SDK and Amazon Bedrock.
Voice Form Filling Demo
Voice-first experience that extracts HTML forms, guides users through each question with Azure Speech Services, and uses Amazon Bedrock to intelligently map responses back to the correct form fields. The code is written in TypeScript and ships as an embeddable initialization helper plus a Vite-powered demo page.
Features
- Speech-to-text and text-to-speech via Azure Cognitive Services Speech SDK.
- Automatic extraction of standard HTML form fields, including validation metadata and select/radio/checkbox options.
- Routing of free-form user answers to the correct fields using an Amazon Bedrock model via AWS SDK-only integration.
- Multi-language greeting and prompt flow (English and Hindi) with configurable voice selection.
- Constraint-aware field assignment with audible retry prompts if user input violates form rules.
- Multi-step National Scholarship Portal registration demo covering personal, academic, bank, and document reference sections.
- Simple UI hooks to surface prompts, transcripts, and assignments inside any application.
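As a sketch of the constraint-aware assignment idea (all names here are hypothetical, not the library's actual API), a validator might check a transcribed answer against the extracted field metadata before filling the form, returning a reason string that can be spoken back as a retry prompt:

```typescript
// Hypothetical field metadata shape, mirroring what a DOM extractor might
// collect from required/pattern attributes and <option> lists.
interface ExtractedField {
  name: string;
  required?: boolean;
  pattern?: string;   // regex from the input's pattern attribute
  options?: string[]; // for select/radio/checkbox inputs
}

// Returns null when the answer satisfies the field's constraints,
// otherwise a human-readable reason suitable for a spoken retry prompt.
function validateAnswer(field: ExtractedField, answer: string): string | null {
  const value = answer.trim();
  if (field.required && value === "") {
    return `${field.name} is required.`;
  }
  if (field.pattern && !new RegExp(`^(?:${field.pattern})$`).test(value)) {
    return `${field.name} does not match the expected format.`;
  }
  if (
    field.options &&
    !field.options.some((o) => o.toLowerCase() === value.toLowerCase())
  ) {
    return `Please choose one of: ${field.options.join(", ")}.`;
  }
  return null;
}
```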
Project Structure
```
voice-form-filling/
├─ src/                # Library source (compiled with `tsc`)
│  ├─ index.ts         # Public initializeVoiceForm() entry point
│  ├─ VoiceFormService.ts
│  ├─ bedrockRouter.ts # Bedrock SDK integration helper
│  ├─ formExtractor.ts # DOM field parsing helpers
│  └─ types.ts         # Shared type definitions
├─ tsconfig.json       # Library TypeScript config (emits to dist/)
├─ tsconfig.app.json   # Demo TypeScript config used by Vite tooling
├─ vite.config.ts      # Vite dev/build configuration (root set to demo/)
└─ package.json
```
Prerequisites
- Node.js 18+
- Azure Speech resource with a subscription key and region.
- (Optional) Amazon Bedrock access with an identity that can invoke your chosen model.
Setup
```bash
npm install
```
Copy your secrets into environment variables or keep them handy; the demo prompts for them at runtime, so they are never committed.
Running the Demo
```bash
npm run dev
```
Then open the printed Vite dev server URL (default http://localhost:3045) and provide:
- Azure Speech Key and Region – required for speech recognition and synthesis.
- AWS/Bedrock credentials, supplied as Vite env vars before running: `VITE_AZURE_SPEECH_KEY`, `VITE_AZURE_SPEECH_REGION`, `VITE_BEDROCK_MODEL_ID`, `VITE_BEDROCK_REGION`, `VITE_AWS_ACCESS_KEY_ID`, `VITE_AWS_SECRET_ACCESS_KEY`, and optionally `VITE_AWS_SESSION_TOKEN`.
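For local development, these can live in an untracked `.env.local` file that Vite loads automatically. The values below are placeholders only (pick the model ID and regions you actually use, and never commit real keys):

```shell
# .env.local — placeholder values; Vite exposes VITE_-prefixed vars to the app
VITE_AZURE_SPEECH_KEY=your-azure-speech-key
VITE_AZURE_SPEECH_REGION=centralindia
VITE_BEDROCK_MODEL_ID=anthropic.claude-3-haiku-20240307-v1:0
VITE_BEDROCK_REGION=us-east-1
VITE_AWS_ACCESS_KEY_ID=AKIA...
VITE_AWS_SECRET_ACCESS_KEY=...
# VITE_AWS_SESSION_TOKEN=...   # only needed for temporary credentials
```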
Press Start Voice Flow to begin. The assistant will ask for your language preference (English or Hindi), announce each pending field in the National Scholarship Portal-style form, read options for select/radio/checkbox inputs, and listen for your spoken answers. When speech input falls silent, the Bedrock router determines which fields were answered and fills the underlying HTML form. The activity log in the left panel mirrors spoken prompts, transcripts, and field assignments, while the on-page stepper highlights the current section.
Testing on Mobile (HTTPS Required)
Browsers only expose the microphone to pages loaded in a secure context (HTTPS or http://localhost). When you visit the dev server from a phone or tablet using `http://<your-ip>:3045`, mobile browsers block audio capture and the mic never starts. Generate a trusted development certificate (for example with `mkcert`) and point the Vite dev server at it:

```bash
mkcert -install
mkcert -key-file certs/dev-key.pem -cert-file certs/dev-cert.pem 127.0.0.1 ::1 <your-ip> mysite.test
DEV_SERVER_USE_HTTPS=true \
DEV_SERVER_SSL_KEY=certs/dev-key.pem \
DEV_SERVER_SSL_CERT=certs/dev-cert.pem \
npm run dev
```

These environment variables enable HTTPS in `vite.config.ts`. Update the hostnames passed to `mkcert` to match how you access the site on mobile, then load `https://<your-ip>:3045` (accept the certificate the first time). Once served over HTTPS, the microphone prompt appears and speech recognition works on mobile browsers.
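The repository's actual `vite.config.ts` may differ, but the env-var wiring described above could look roughly like this (a sketch under the assumption that the config reads `DEV_SERVER_USE_HTTPS` and the two certificate paths, not the shipped file):

```typescript
// vite.config.ts (sketch) — enable HTTPS only when DEV_SERVER_USE_HTTPS is set
import { defineConfig } from "vite";
import fs from "node:fs";

export default defineConfig({
  root: "demo",
  server: {
    port: 3045,
    host: true, // listen on all interfaces so phones on the LAN can connect
    https:
      process.env.DEV_SERVER_USE_HTTPS === "true"
        ? {
            key: fs.readFileSync(process.env.DEV_SERVER_SSL_KEY!),
            cert: fs.readFileSync(process.env.DEV_SERVER_SSL_CERT!),
          }
        : undefined,
  },
});
```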
Embedding in Your Application
```ts
import { initializeVoiceForm, AwsBedrockRouter } from "@myscheme/voice-form-filling";
import { BedrockRuntimeClient } from "@aws-sdk/client-bedrock-runtime";

const controller = await initializeVoiceForm({
  formSelector: "#checkout-form",
  azureSpeech: {
    subscriptionKey: process.env.AZURE_SPEECH_KEY!,
    region: process.env.AZURE_SPEECH_REGION!,
  },
  bedrock: {
    modelId: "anthropic.claude-3-haiku-20240307-v1:0",
    router: new AwsBedrockRouter({
      client: new BedrockRuntimeClient({ region: "us-east-1" }),
      modelId: "anthropic.claude-3-haiku-20240307-v1:0",
    }),
  },
});

await controller.start();
```
Use the optional uiHooks callbacks to surface transcripts or status updates, and call controller.stop() when the user leaves the flow.
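The exact uiHooks shape isn't documented here, so the following is an illustration only (all callback names are hypothetical): each callback lets the host application mirror the voice flow, for example into an activity log like the demo's left-hand panel.

```typescript
// Hypothetical uiHooks contract — illustrative, not the library's documented API.
interface UiHooks {
  onPrompt?: (text: string) => void;     // assistant is about to speak
  onTranscript?: (text: string) => void; // user speech was recognized
  onAssignment?: (field: string, value: string) => void; // a field was filled
}

// A simple hook set that collects events into a log array, suitable for
// rendering a transcript/status panel in any UI framework.
function makeLoggingHooks(log: string[]): UiHooks {
  return {
    onPrompt: (t) => log.push(`prompt: ${t}`),
    onTranscript: (t) => log.push(`heard: ${t}`),
    onAssignment: (f, v) => log.push(`set ${f} = ${v}`),
  };
}
```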
Amazon Bedrock Notes
- Browsers should not store long-lived AWS secrets. For production, expose a secure backend endpoint that proxies Bedrock requests and pass it into the library as a custom `BedrockRouter` implementation.
- The included `AwsBedrockRouter` helper formats prompts for Anthropic Claude 3 models. Adapt the prompt builder if you prefer other providers.
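A backend-proxy router could look roughly like the sketch below. The endpoint path, request shape, and `route` method name are all assumptions for illustration, not the library's real `BedrockRouter` contract; the fetch-like function is injected so credentials stay on the server and the class is testable without a network.

```typescript
// Hypothetical router contract: given the user's utterance and the extracted
// fields, return { fieldName: value } assignments decided by the backend.
type FieldAssignments = Record<string, string>;

interface RouteRequest {
  transcript: string;
  fields: { name: string; label: string }[];
}

class BackendProxyRouter {
  constructor(
    private endpoint: string,
    // Injectable fetch-like function; in the browser, pass window.fetch.
    private fetchFn: (
      url: string,
      init: unknown,
    ) => Promise<{ json(): Promise<FieldAssignments> }>,
  ) {}

  // POSTs the transcript and field metadata to the backend, which holds the
  // AWS credentials and calls Bedrock on the client's behalf.
  async route(req: RouteRequest): Promise<FieldAssignments> {
    const res = await this.fetchFn(this.endpoint, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(req),
    });
    return res.json();
  }
}
```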
Developing the Library
```bash
npm run typecheck
npm run build
```
Compiled artifacts land in dist/, mirroring the structure in src/.
Next Steps
- Extend validation rules (e.g., custom date ranges, dependent questions).
- Persist conversation state to resume flows.
License
This project is provided for demonstration purposes without an explicit license. Add one before distributing.
