dictate-button
v2.1.1
Published
Customizable Web Component that adds speech-to-text dictation capabilities to text fields
Maintainers
Readme
Dictate Button
A customizable web component that adds speech-to-text dictation capabilities to any text input, textarea field, or contenteditable element on your website.
Developed for dictate-button.io.
Features
- Easy integration with any website
- Compatible with any framework (or no framework)
- Automatic injection into text fields with the
data-dictate-button-onattribute (exclusive mode) or without thedata-dictate-button-offattribute (inclusive mode) - Simple speech-to-text functionality with clean UI
- Customizable size and API endpoint
- Dark and light theme support
- Event-based API for interaction with your application
- Built with SolidJS for optimal performance
- Accessibility is ensured with ARIA attributes, high-contrast mode support, and clear keyboard focus states
Supported tags (by our inject scripts)
- textarea
- input[type="text"]
- input[type="search"]
- input (without a type; defaults to text)
- [contenteditable] elements
Usage
Auto-inject modes
Choose the auto-inject mode that best suits your needs:
| Mode | Description | Scripts |
|---|---|---|
| Exclusive | Enables for text fields with the data-dictate-button-on attribute only. | inject-exclusive.js |
| Inclusive | Enables for text fields without the data-dictate-button-off attribute. | inject-inclusive.js |
Both auto-inject modes:
- Automatically run on DOMContentLoaded (or immediately if the DOM is already loaded).
- Watch for DOM changes to apply the dictate button to newly added elements.
- Set the button’s language from
document.documentElement.lang(if present). Long codes likeen-GBare normalized toen. - Position the button to the top right-hand corner of the text field, respecting its padding with 4px fallback if the padding is not set (0).
From CDN
Option 1: Using the exclusive auto-inject script
In your HTML <head> tag, add the following script tag:
<script type="module" crossorigin src="https://cdn.dictate-button.io/inject-exclusive.js"></script>Add the data-dictate-button-on attribute to any textarea, input[type="text"], input[type="search"], input without a type attribute, or element with the contenteditable attribute:
<textarea data-dictate-button-on></textarea>
<input type="text" data-dictate-button-on />
<input type="search" data-dictate-button-on />
<input data-dictate-button-on />
<div contenteditable data-dictate-button-on />Option 2: Using the inclusive auto-inject script
In your HTML <head> tag, add the following script tag:
<script type="module" crossorigin src="https://cdn.dictate-button.io/inject-inclusive.js"></script>All textarea, input[type="text"], input[type="search"], input elements without a type attribute, and elements with the contenteditable attribute that lack data-dictate-button-off will be automatically enhanced by default.
To disable that for a specific field, add the data-dictate-button-off attribute to it this way:
<textarea data-dictate-button-off></textarea>
<input type="text" data-dictate-button-off />
<input type="search" data-dictate-button-off />
<input data-dictate-button-off />
<div contenteditable data-dictate-button-off />Option 3: Manual integration
Import the component and use it directly in your code:
<script type="module" crossorigin src="https://cdn.dictate-button.io/dictate-button.js"></script>
<dictate-button size="30" api-endpoint="wss://api.dictate-button.io/v2/transcribe" language="en"></dictate-button>From NPM
Import once for your app:
// For selected text fields (with data-dictate-button-on attribute):
import 'dictate-button/inject-exclusive'
// or for all text fields (except those with data-dictate-button-off attribute):
import 'dictate-button/inject-inclusive'To choose between exclusive and inclusive auto-inject modes, see the Auto-inject modes section.
Advanced usage with library functions
If you need more control over when and how the dictate buttons are injected, you can use the library functions directly:
Tip: You can also import from subpaths (e.g., 'dictate-button/libs/injectDictateButton') for smaller bundles, if your bundler resolves package subpath exports.
import 'dictate-button' // Required when using library functions directly
import { injectDictateButton, injectDictateButtonOnLoad } from 'dictate-button/libs'
// Inject dictate buttons immediately to matching elements
injectDictateButton(
'textarea.custom-selector', // CSS selector for target elements
{
buttonSize: 30, // Button size in pixels (optional; default: 30)
verbose: false, // Log events to console (optional; default: false)
apiEndpoint: 'wss://api.example.com/transcribe' // Optional custom API endpoint
}
)
// Inject on DOM load with mutation observer to catch dynamically added elements
injectDictateButtonOnLoad(
'input.custom-selector', // CSS selector for target elements
{
buttonSize: 30, // Button size in pixels (optional; default: 30)
verbose: false, // Log events to console (optional; default: false)
apiEndpoint: 'wss://api.example.com/transcribe', // Optional custom API endpoint
watchDomChanges: true // Watch for DOM changes (optional; default: false)
}
)Note: the injector mirrors the target field’s display/margins into the wrapper,
sets wrapper width to 100% for block-level fields, and adds padding to avoid the button overlapping text.
The wrapper also has the dictate-button-wrapper class for easy styling.
Events
The dictate-button component emits the following events:
dictate-start: Fired when transcription starts (after microphone access is granted and WebSocket connection is established).dictate-text: Fired during transcription when text is available. This includes both interim (partial) transcripts that may change and final transcripts. The event detail contains the current transcribed text.dictate-end: Fired when transcription ends. The event detail contains the final transcribed text.dictate-error: Fired when an error occurs (microphone access denied, WebSocket connection failure, server error, etc.). The event detail contains the error message.
The typical flow is:
dictate-start -> dictate-text (multiple times) -> dictate-end
In case of an error, the dictate-error event is fired.
Example event handling:
const dictateButton = document.querySelector('dictate-button');
dictateButton.addEventListener('dictate-start', () => {
console.log('Transcription started');
});
dictateButton.addEventListener('dictate-text', (event) => {
const currentText = event.detail;
console.log('Current text:', currentText);
// Update UI with interim/partial transcription
});
dictateButton.addEventListener('dictate-end', (event) => {
const finalText = event.detail;
console.log('Final transcribed text:', finalText);
// Add the final text to your input field
document.querySelector('#my-input').value += finalText;
});
dictateButton.addEventListener('dictate-error', (event) => {
const error = event.detail;
console.error('Transcription error:', error);
});Attributes
| Attribute | Type | Default | Description | |---------------|---------|--------------------------------------------|-----------------------------------------| | size | number | 30 | Size of the button in pixels | | apiEndpoint | string | wss://api.dictate-button.io/v2/transcribe | WebSockets API endpoint of transcription service | | language | string | en | Optional language code (e.g., 'fr', 'de') | | theme | string | (inherits from page) | 'light' or 'dark' | | class | string | | Custom CSS class |
Styling
You can customize the appearance of the dictate button using CSS parts:
/* Style the button container */
dictate-button::part(container) {
/* Custom styles */
}
/* Style the button itself */
dictate-button::part(button) {
/* Custom styles */
}
/* Style the button icons */
dictate-button::part(icon) {
/* Custom styles */
}API Endpoint
By default, dictate-button uses the wss://api.dictate-button.io/v2/transcribe endpoint for real-time speech-to-text streaming.
You can specify your own endpoint by setting the apiEndpoint attribute.
The API uses WebSocket for real-time transcription:
- Protocol: WebSocket (wss://)
- Connection: Opens WebSocket connection with optional language query parameter (e.g.,
?language=en) - Audio Format: PCM16 audio data at 16kHz sample rate, sent as binary chunks
- Messages Sent:
- Binary audio data (Int16Array buffers) - Continuous stream of PCM16 audio chunks
{ type: 'close' }- JSON message to signal end of audio stream and trigger finalization
- Messages Received: JSON messages with the following types:
{ type: 'session_opened', sessionId: string, expiresAt: number }- Session started{ type: 'interim_transcript', text: string }- Interim (partial) transcription result that may change as more audio is processed{ type: 'transcript', text: string, turn_order?: number }- Final transcription result for the current turn{ type: 'session_closed', code: number, reason: string }- Session ended{ type: 'error', error: string }- Error occurred
Browser Compatibility
The dictate-button component requires the following browser features:
- Web Components
- MediaStream API (getUserMedia)
- Web Audio API (AudioContext, AudioWorklet)
- WebSocket API
Works in all modern browsers (Chrome, Firefox, Safari, Edge).
