@thinkata/voice-kit

v0.0.4

Published

5 months ago


Voice Kit

Complete voice-powered application toolkit - Speech-to-text, AI form filling, and serverless API handlers all in one package.

Version: 0.0.4

Overview

Voice Kit is an all-in-one solution for building voice-powered web applications. It combines:

Core - Framework-agnostic speech-to-text primitives
Forms - Vue 3 composables for voice-powered form filling
Server - Nitro/h3 API handlers for serverless deployments
Components - Ready-to-use Vue components for voice input

All in a single, easy-to-install package!

⚠️ Important Notice

This version of Voice Kit is NOT suitable for PII (Personally Identifiable Information) workflows. The example applications and default configurations are designed for non-sensitive use cases such as product feedback, surveys, and general data collection. For applications handling PII, additional security measures, data encryption, and compliance considerations are required.

Features

🎤 Multiple Speech Providers - ElevenLabs, OpenAI, Together AI
🤖 AI-Powered Form Filling - Intelligent form parsing with LLM integration
⚡ Serverless Ready - Works with Nuxt, Nitro, Cloudflare Workers, Vercel
🔒 Built-in Rate Limiting - Protect your API endpoints
📱 Vue 3 Composables - Reactive hooks for easy integration
🎨 Ready-to-Use Components - Beautiful voice input UI components
🎯 TypeScript First - Full type safety out of the box
🌐 Framework-Agnostic Core - Use anywhere JavaScript runs

Installation

npm install @thinkata/voice-kit

Quick Start

Option 1: Use Pre-built Components (Easiest)

<script setup>
import VoiceInput from '@thinkata/voice-kit/components/VoiceInput.vue'
import { reactive } from 'vue'

const formData = reactive({
  productName: '',
  rating: '',
  feedback: ''
})

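const handleTranscript = async (transcript) => {
  // Send the transcript to the parse endpoint along with the fields to fill
  const response = await fetch('/api/parse-speech', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      text: transcript,
      formStructure: { fields: ['productName', 'rating', 'feedback'] }
    })
  })

  // Leave the form untouched if the server reports an error
  if (!response.ok) {
    console.error(`Parsing failed with status ${response.status}`)
    return
  }

  const result = await response.json()
  Object.assign(formData, result.data)
}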
</script>

<template>
  <div>
    <VoiceInput 
      @transcript="handleTranscript" 
      apiEndpoint="/api/speech-to-text"
    />
    
    <form>
      <input v-model="formData.productName" placeholder="Product Name" />
      <select v-model="formData.rating">
        <option value="">Select Rating</option>
        <option value="5">5 - Excellent</option>
        <option value="4">4 - Very Good</option>
        <option value="3">3 - Good</option>
        <option value="2">2 - Fair</option>
        <option value="1">1 - Poor</option>
      </select>
      <textarea v-model="formData.feedback" placeholder="Your feedback..." />
      <button>Submit</button>
    </form>
  </div>
</template>
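
The example above merges whatever the parse endpoint returns straight into the form. Since that data comes from an LLM, it can help to copy over only the fields the form actually has. The helper below is a minimal sketch of that idea; `applyParsedFields`, `FeedbackForm`, and `ALLOWED_RATINGS` are hypothetical names introduced here and are not part of @thinkata/voice-kit.

```typescript
// Hypothetical helper, not part of @thinkata/voice-kit.
// Copies only known, well-typed fields from the parse result into the form.
type FeedbackForm = { productName: string; rating: string; feedback: string }

// Values that the rating <select> in the example can display.
const ALLOWED_RATINGS = ['1', '2', '3', '4', '5']

function applyParsedFields(form: FeedbackForm, parsed: unknown): FeedbackForm {
  // Ignore anything that is not a plain object (null, arrays, strings, ...).
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    return form
  }
  const data = parsed as Record<string, unknown>

  if (typeof data.productName === 'string') form.productName = data.productName.trim()
  if (typeof data.feedback === 'string') form.feedback = data.feedback.trim()

  // Normalise numeric ratings to strings and drop values the <select> cannot show.
  const rating = String(data.rating ?? '')
  if (ALLOWED_RATINGS.includes(rating)) form.rating = rating

  return form
}
```

In `handleTranscript`, this would replace the `Object.assign` call: `applyParsedFields(formData, result.data)`. Unknown keys are dropped, so an unexpected property in the response never ends up on the reactive form object.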

Option 2: Use Composables (More Control)

<template>
  <div>
    <button @click="toggleRecording">
      {{ isRecording ? 'Stop' : 'Start' }} Recording
    </button>
    <p v-if="transcript">{{ transcript }}</p>
    <p v-if="error">{{ error }}</p>
  </div>
</template>

<script setup>
import { useSimpleVoiceKit } from '@thinkata/voice-kit'

const {
  isRecording,
  transcript,
  error,
  toggleRecording
} = useSimpleVoiceKit({
  apiEndpoint: '/api/speech-to-text'
})
</script>

Server-Side Setup (Required)

Create API endpoints in your Nuxt/Nitro app:

`/server/api/speech-to-text.post.ts`

import { createSpeechToTextHandler } from '@thinkata/voice-kit'

export default createSpeechToTextHandler({
  elevenLabsApiKey: process.env.ELEVENLABS_API_KEY,
  togetherApiKey: process.env.TOGETHER_API_KEY,
  openaiApiKey: process.env.OPENAI_API_KEY,
  speechProvider: (process.env.SPEECH_PROVIDER || 'elevenlabs') as any
})

`/server/api/parse-speech.post.ts` (Optional, for form filling)

import { createParseSpeechHandler } from '@thinkata/voice-kit'

export default createParseSpeechHandler({
  openaiApiKey: process.env.OPENAI_API_KEY,
  anthropicApiKey: process.env.ANTHROPIC_API_KEY,
  togetherApiKey: process.env.TOGETHER_API_KEY,
  defaultProvider: (process.env.LLM_PROVIDER || 'together') as any
})

Environment Variables

Create a .env file:

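# Speech-to-text provider: set SPEECH_PROVIDER and the matching API key
# (ELEVENLABS_API_KEY, OPENAI_API_KEY, or TOGETHER_API_KEY)
ELEVENLABS_API_KEY=your_elevenlabs_key
SPEECH_PROVIDER=elevenlabs

# LLM provider for form parsing (optional): set LLM_PROVIDER and the matching API key
# (TOGETHER_API_KEY, OPENAI_API_KEY, or ANTHROPIC_API_KEY)
TOGETHER_API_KEY=your_together_key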
LLM_PROVIDER=together
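
A missing or mistyped variable usually only surfaces on the first request. One way to catch it earlier is to check the environment before building the handler config. The sketch below assumes the config shape listed under `createSpeechToTextHandler(config)`; `resolveSpeechConfig`, `SpeechConfig`, and `KEY_VARS` are hypothetical names, and the package's own exported types may differ.

```typescript
// Hypothetical helper, not part of @thinkata/voice-kit.
type SpeechProvider = 'elevenlabs' | 'together' | 'openai'

// Mirrors the documented config fields; the package's real type may differ.
interface SpeechConfig {
  elevenLabsApiKey?: string
  togetherApiKey?: string
  openaiApiKey?: string
  speechProvider: SpeechProvider
}

type Env = Record<string, string | undefined>

// Which environment variable holds the key for each provider.
const KEY_VARS: Record<SpeechProvider, string> = {
  elevenlabs: 'ELEVENLABS_API_KEY',
  together: 'TOGETHER_API_KEY',
  openai: 'OPENAI_API_KEY'
}

function isSpeechProvider(value: string): value is SpeechProvider {
  return Object.keys(KEY_VARS).includes(value)
}

function resolveSpeechConfig(env: Env): SpeechConfig {
  // Same default as the server example: fall back to ElevenLabs.
  const provider = env.SPEECH_PROVIDER || 'elevenlabs'
  if (!isSpeechProvider(provider)) {
    throw new Error(`Unsupported SPEECH_PROVIDER: ${provider}`)
  }

  // The selected provider must have its key set.
  const keyVar = KEY_VARS[provider]
  if (!env[keyVar]) {
    throw new Error(`${keyVar} must be set when SPEECH_PROVIDER is "${provider}"`)
  }

  return {
    elevenLabsApiKey: env.ELEVENLABS_API_KEY,
    togetherApiKey: env.TOGETHER_API_KEY,
    openaiApiKey: env.OPENAI_API_KEY,
    speechProvider: provider
  }
}
```

With this in place, the route file could call `createSpeechToTextHandler(resolveSpeechConfig(process.env))`, which also removes the need for the `as any` cast on `speechProvider`.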

API Reference

Vue Components

See COMPONENTS.md for detailed component documentation.

`VoiceInput.vue`

Ready-to-use voice input component with recording UI.

Import:

import VoiceInput from '@thinkata/voice-kit/components/VoiceInput.vue'

Props:

apiEndpoint?: string - Speech-to-text API endpoint (default: '/api/speech-to-text')

Events:

@transcript(text: string) - Emitted when speech is transcribed
@error(error: string) - Emitted on error

Client-Side Composables

`useSimpleVoiceKit(options)`

Basic voice recording and transcription.

Options:

apiEndpoint?: string - Server endpoint (default: '/api/speech-to-text')
autoStart?: boolean - Auto-start recording (default: false)

Returns:

isRecording: Ref<boolean> - Recording state
isProcessing: Ref<boolean> - Processing state
transcript: Ref<string> - Transcribed text
error: Ref<string | null> - Error message
startRecording() - Start recording
stopRecording() - Stop and transcribe
toggleRecording() - Toggle state

`useVoiceKitWithForms(options)`

Voice-powered form filling with AI parsing.

Options:

formStructure: FormStructure - Form field definitions
apiEndpoint?: string - Speech-to-text endpoint
parseEndpoint?: string - Form parsing endpoint (default: '/api/parse-speech')

Returns: everything useSimpleVoiceKit returns, plus:

formData: Ref<Record<string, any>> - Parsed form data
fillForm() - Fill form with parsed data

Server-Side Handlers

`createSpeechToTextHandler(config)`

Creates an h3 handler for speech-to-text.

Config:

elevenLabsApiKey?: string - ElevenLabs API key
togetherApiKey?: string - Together AI API key
openaiApiKey?: string - OpenAI API key
speechProvider: 'elevenlabs' | 'together' | 'openai' - Provider to use

`createParseSpeechHandler(config)`

Creates an h3 handler for parsing speech into form data.

Config:

openaiApiKey?: string - OpenAI API key
anthropicApiKey?: string - Anthropic API key
togetherApiKey?: string - Together AI API key
defaultProvider: 'openai' | 'anthropic' | 'together' - LLM provider
maxRetries?: number - Max retry attempts (default: 3)
timeout?: number - Request timeout ms (default: 30000)

`createLLMStatusHandler(config)`

Creates an h3 handler for checking LLM availability.

Config:

openaiApiKey?: string - OpenAI API key
anthropicApiKey?: string - Anthropic API key
togetherApiKey?: string - Together AI API key
llmProvider: 'openai' | 'anthropic' | 'together' | 'auto' - LLM provider to check
llmModel?: string - Model name (default: 'auto')

Supported Platforms

✅ Nuxt 3 - Full support with server API routes
✅ Nitro - Standalone server applications
✅ Cloudflare Workers - Serverless edge deployment
✅ Vercel - Serverless functions
✅ Node.js - Standard HTTP servers
✅ Vue 3 - Client-side composables

Speech Providers

ElevenLabs (Recommended)

High accuracy
Low latency
Supports multiple languages
Get API key: https://elevenlabs.io

OpenAI Whisper

Excellent accuracy
Supports 100+ languages
Get API key: https://platform.openai.com

Together AI

Cost-effective
Good performance
Get API key: https://together.ai

Security Best Practices

Never expose API keys - Keep them in environment variables on the server
Use rate limiting - Protect against abuse (built-in)
Validate inputs - Use built-in validation utilities
HTTPS only - Never use HTTP in production
CORS configuration - Restrict origins in production
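
To make the rate-limiting point concrete, here is a minimal sketch of a fixed-window limiter. It illustrates the general technique only: Voice Kit's built-in limiter may work differently, and `createRateLimiter` with its options is a hypothetical name introduced for this example.

```typescript
// Illustrative fixed-window rate limiter; NOT Voice Kit's built-in implementation.
interface RateLimiterOptions {
  limit: number       // maximum requests allowed per window
  windowMs: number    // window length in milliseconds
  now?: () => number  // injectable clock, defaults to Date.now
}

function createRateLimiter({ limit, windowMs, now = Date.now }: RateLimiterOptions) {
  // One counter per client, keyed by an identifier such as an IP address.
  const windows = new Map<string, { start: number; count: number }>()

  return function isAllowed(clientId: string): boolean {
    const t = now()
    const current = windows.get(clientId)

    // First request from this client, or the previous window has expired.
    if (!current || t - current.start >= windowMs) {
      windows.set(clientId, { start: t, count: 1 })
      return true
    }

    // Window is still open: reject once the limit is reached.
    if (current.count >= limit) return false
    current.count += 1
    return true
  }
}
```

A handler would call `isAllowed(clientId)` before doing any work and respond with HTTP 429 when it returns false. Note that an in-memory map like this is local to one process, so on serverless platforms each instance keeps its own counters.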

Troubleshooting

"No audio recorded"

Check microphone permissions in the browser
Verify that the browser supports the MediaRecorder API
Test in a different browser (Chrome recommended)
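
The browser checks above can be automated with simple feature detection. The sketch below uses only standard web APIs (`MediaRecorder` and `navigator.mediaDevices.getUserMedia`); `detectRecordingSupport` and `BrowserScope` are hypothetical names and not part of the package.

```typescript
// Hypothetical helper, not part of @thinkata/voice-kit.
// Describes just the parts of the browser global that the check needs.
interface BrowserScope {
  MediaRecorder?: unknown
  navigator?: { mediaDevices?: { getUserMedia?: unknown } }
}

// Returns a list of problems; an empty list means recording should be possible.
function detectRecordingSupport(
  scope: BrowserScope = globalThis as unknown as BrowserScope
): string[] {
  const problems: string[] = []

  // MediaRecorder is a constructor, so typeof reports 'function' when present.
  if (typeof scope.MediaRecorder !== 'function') {
    problems.push('MediaRecorder API is not available')
  }

  // navigator.mediaDevices is undefined in insecure (non-HTTPS) contexts.
  if (typeof scope.navigator?.mediaDevices?.getUserMedia !== 'function') {
    problems.push('navigator.mediaDevices.getUserMedia is not available')
  }

  return problems
}
```

Calling `detectRecordingSupport()` in the browser before showing the record button lets the UI explain why recording is unavailable. It does not check microphone permission itself, which is only known once `getUserMedia` is actually called.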

"API key invalid"

Verify that the .env file is loaded correctly
Check API key format and validity
Ensure keys are set on the server side only

"Package not found" errors

Run npm install to ensure dependencies are installed
Clear cache: npm cache clean --force
Check package version: npm list @thinkata/voice-kit

Component import errors

Ensure you're using the correct import path: @thinkata/voice-kit/components/VoiceInput.vue
Check that Vue is installed: npm install vue@^3.0.0

Examples

See the examples/formfiller directory for a complete working example of a Product Feedback Form - a non-PII use case demonstrating voice-powered form filling for product reviews and feedback collection.

Contributing

Contributions are welcome! Please see our GitHub repository for guidelines.
