@atik9157/aiui-react-sdk
Transform any React application into a voice-controllable interface with AI-powered context awareness. This SDK automatically discovers your UI elements and enables natural voice interactions.
✨ Features
- 🎤 Voice Control - Natural voice commands to interact with your UI
- 🧠 Smart Context Detection - Automatically discovers and tracks interactive elements
- 🔄 Real-time Updates - Efficient DOM change detection with incremental context updates
- 🎯 Semantic Actions - Interact with elements using natural language descriptions
- 🔒 Privacy-First - Built-in data filtering and redaction for sensitive information
- 📱 Multi-Select Support - Handle complex dropdown and multi-select interactions
- ⚡ Optimized Performance - Smart debouncing and efficient WebSocket communication
- 🎨 AI-Agnostic Backend - Works with OpenAI, Claude, local models, or any AI
- 📦 TypeScript Support - Full type definitions included
📦 Installation
```bash
npm install @atik9157/aiui-react-sdk
```

Required Audio Worklet Files
Important: You must add two audio worklet processor files to your public folder for voice functionality:
1. Create `public/player-processor.js`:

```js
/* Audio playback worklet */
class PlayerProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.queue = [];
    this.offset = 0;
    this.port.onmessage = e => this.queue.push(e.data);
  }

  process(_, outputs) {
    const out = outputs[0][0];
    let idx = 0;
    while (idx < out.length) {
      if (!this.queue.length) {
        out.fill(0, idx);
        break;
      }
      const buf = this.queue[0];
      const copy = Math.min(buf.length - this.offset, out.length - idx);
      out.set(buf.subarray(this.offset, this.offset + copy), idx);
      idx += copy;
      this.offset += copy;
      if (this.offset >= buf.length) {
        this.queue.shift();
        this.offset = 0;
      }
    }
    return true;
  }
}

registerProcessor('player-processor', PlayerProcessor);
```

2. Create `public/worklet-processor.js`:
```js
/* Microphone worklet - captures and downsamples to 16kHz */
class MicProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.dstRate = 16_000;
    this.frameMs = 20;
    this.srcRate = sampleRate;
    this.ratio = this.srcRate / this.dstRate;
    this.samplesPerPacket = Math.round(this.dstRate * this.frameMs / 1_000);
    this.packet = new Int16Array(this.samplesPerPacket);
    this.pIndex = 0;
    this.acc = 0;
    this.seq = 0;
  }

  process(inputs) {
    const input = inputs[0];
    if (!input || !input[0]?.length) return true;
    const ch = input[0];
    for (let i = 0; i < ch.length; i++) {
      this.acc += 1;
      if (this.acc >= this.ratio) {
        const s = Math.max(-1, Math.min(1, ch[i]));
        this.packet[this.pIndex++] = s < 0 ? s * 32768 : s * 32767;
        this.acc -= this.ratio;
        if (this.pIndex === this.packet.length) {
          this.port.postMessage(this.packet.buffer, [this.packet.buffer]);
          this.packet = new Int16Array(this.samplesPerPacket);
          this.pIndex = 0;
          this.seq++;
        }
      }
    }
    return true;
  }
}

registerProcessor("mic-processor", MicProcessor);
```

Your project structure should look like:
```
your-app/
├── public/
│   ├── player-processor.js    ← Required for audio playback
│   ├── worklet-processor.js   ← Required for microphone input
│   └── index.html
├── src/
│   └── App.tsx
└── package.json
```

⚠️ Note: These worklet files must be in the `public` folder and served at the `/player-processor.js` and `/worklet-processor.js` URLs. The SDK loads them at runtime for audio processing.
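For reference, the per-sample conversion at the heart of `worklet-processor.js` (clamp a Float32 sample to [-1, 1], then scale to signed 16-bit) can be pulled out as a plain function. This helper is illustrative only, not an SDK export:

```typescript
// Mirrors the conversion in worklet-processor.js: clamp a Web Audio
// Float32 sample to [-1, 1], then scale to a signed 16-bit integer.
// Negatives scale by 32768 and positives by 32767 so both extremes map
// onto the asymmetric Int16 range (-32768..32767) without overflow.
function floatToInt16(sample: number): number {
  const s = Math.max(-1, Math.min(1, sample));
  // Int16Array assignment truncates toward zero, so Math.trunc matches it.
  return Math.trunc(s < 0 ? s * 32768 : s * 32767);
}
```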
🚀 Quick Start
1. Wrap your app with AIUIProvider
```tsx
import { AIUIProvider } from '@atik9157/aiui-react-sdk';
import type { AIUIConfig } from '@atik9157/aiui-react-sdk';

const config: AIUIConfig = {
  applicationId: 'my-awesome-app',
  serverUrl: 'wss://your-aiui-server.com',
  apiKey: 'your-api-key', // Optional
  pages: [
    {
      route: '/',
      title: 'Home',
      safeActions: ['click', 'set_value'],
    },
    {
      route: '/dashboard',
      title: 'Dashboard',
    }
  ]
};

function App() {
  return (
    <AIUIProvider config={config}>
      <YourApp />
    </AIUIProvider>
  );
}
```

2. Add voice control button
```tsx
import { useAIUI } from '@atik9157/aiui-react-sdk';

function VoiceControlButton() {
  const { isConnected, isListening, startListening, stopListening } = useAIUI();

  return (
    <div>
      <p>Status: {isConnected ? '🟢 Connected' : '🔴 Disconnected'}</p>
      <button onClick={isListening ? stopListening : startListening}>
        {isListening ? '🎤 Listening...' : '🔇 Start Voice Control'}
      </button>
    </div>
  );
}
```

3. That's it! Your app is now voice-controllable 🎉
Users can now say things like:
- "Click the submit button"
- "Fill in the email field with [email protected]"
- "Navigate to the dashboard page"
- "Select Marketing and Sales from the categories dropdown"
📚 Configuration
AIUIConfig
| Property | Type | Required | Description |
|----------|------|----------|-------------|
| applicationId | string | ✅ | Unique identifier for your application |
| serverUrl | string | ✅ | WebSocket URL of your AIUI server |
| apiKey | string | ❌ | Authentication key for your server |
| pages | MinimalPageConfig[] | ✅ | Array of page configurations |
| safetyRules | SafetyRules | ❌ | Security and safety configurations |
| privacy | PrivacyConfig | ❌ | Privacy and data filtering rules |
Page Configuration
```ts
interface MinimalPageConfig {
  route: string;               // Page route (e.g., '/dashboard')
  title?: string;              // Page title for context
  safeActions?: string[];      // Allowed actions on this page
  dangerousActions?: string[]; // Actions requiring confirmation
}
```

Safety Rules
Protect your users by restricting dangerous actions and sensitive areas:
```ts
safetyRules: {
  requireConfirmation: ['delete', 'submit_payment', 'purchase'],
  blockedSelectors: ['.admin-only', '[data-sensitive]'],
  allowedDomains: ['yourapp.com', 'api.yourapp.com']
}
```

Privacy Configuration
Automatically redact sensitive information from context:
```ts
privacy: {
  exposePasswords: false,
  exposeCreditCards: false,
  redactPatterns: ['ssn', 'social-security', 'credit-card']
}
```

🎯 Supported Actions
The SDK automatically detects and enables these actions on interactive elements:
| Action | Description | Example Voice Command |
|--------|-------------|----------------------|
| click | Click buttons, links, and interactive elements | "Click the submit button" |
| set_value | Set input/textarea values | "Set email to [email protected]" |
| select_from_dropdown | Select options from dropdowns | "Select Design and Marketing" |
| toggle | Toggle checkboxes | "Toggle the remember me checkbox" |
| navigate | Navigate between routes | "Navigate to dashboard" |
🔧 Advanced Usage
Programmatic Action Execution
Execute actions programmatically without voice:
```tsx
import { useAIUI } from '@atik9157/aiui-react-sdk';

function MyComponent() {
  const { executeAction } = useAIUI();

  const handleCustomAction = async () => {
    try {
      const result = await executeAction('click', {
        semantic: 'submit button'
      });
      if (result.success) {
        console.log('Action executed successfully!');
      }
    } catch (error) {
      console.error('Action failed:', error);
    }
  };

  return <button onClick={handleCustomAction}>Execute Action</button>;
}
```

Index-Aware Element Selection
When multiple identical elements exist (e.g., multiple "Delete" buttons), use index notation:
```ts
// Using number notation
await executeAction('click', { semantic: 'Delete button #3' });

// Using ordinal words
await executeAction('click', { semantic: 'second delete button' });
```

Voice commands work the same way:
- "Click the third delete button"
- "Click delete button number 2"
Enhanced Multi-Select Dropdowns
For React Select or custom dropdowns, add semantic attributes to enable voice selection:
```jsx
<input
  data-select-field="project-categories"
  data-select-options="Design|||Development|||Marketing|||Sales"
  placeholder="Select categories"
/>
```

Then users can say:
- "Select Design and Marketing from project categories"
- "Choose Development and Sales"
Getting Component Values
Read values from discovered elements:
```ts
const { getComponentValue } = useAIUI();

const emailValue = getComponentValue('#email-input');
console.log('Email:', emailValue);
```

Monitoring Context Changes
Track when the UI context changes:
```tsx
import { useEffect } from 'react';

const { currentPage } = useAIUI();

useEffect(() => {
  console.log('Current page:', currentPage);
}, [currentPage]);
```

🎨 TypeScript Support
Full TypeScript definitions are included:
```ts
import type {
  AIUIConfig,
  MinimalPageConfig,
  SafetyRules,
  PrivacyConfig,
  ActionType,
  ActionParams,
  ActionResult,
  ContextUpdate,
  DiscoveredElementState
} from '@atik9157/aiui-react-sdk';
```

🖥️ Development Mode
The SDK includes a development overlay showing real-time connection status:
```ts
// Automatically shown when NODE_ENV === 'development'
// Displays:
// - AIUI connection status (🟢/🔴)
// - Microphone status (🎤/🔇)
// - Current page route
```

The overlay appears in the bottom-right corner and helps you debug connectivity issues during development.
📖 Examples
Basic E-commerce Store
```tsx
import { AIUIProvider, useAIUI } from '@atik9157/aiui-react-sdk';

const config = {
  applicationId: 'ecommerce-store',
  serverUrl: 'wss://aiui.mystore.com',
  pages: [
    { route: '/', title: 'Home' },
    { route: '/products', title: 'Products' },
    { route: '/cart', title: 'Shopping Cart' },
    { route: '/checkout', title: 'Checkout' }
  ],
  safetyRules: {
    requireConfirmation: ['place_order', 'delete_account'],
    blockedSelectors: ['.admin-panel']
  },
  privacy: {
    exposePasswords: false,
    exposeCreditCards: false
  }
};

function App() {
  return (
    <AIUIProvider config={config}>
      <Store />
    </AIUIProvider>
  );
}
```

Voice commands your users can use:
- "Add the blue shirt to cart"
- "Navigate to checkout"
- "Fill shipping address with 123 Main St"
- "Select express shipping"
Dashboard with Filters
<div className="filters">
<label>Project Categories</label>
<input
data-select-field="categories"
data-select-options="Frontend|||Backend|||DevOps|||Design|||Marketing"
placeholder="Select categories..."
/>
<label>Status</label>
<input
data-select-field="status"
data-select-options="Active|||Pending|||Completed|||Archived"
placeholder="Select status..."
/>
</div>Voice commands:
- "Select Frontend and Backend from categories"
- "Choose Active from status"
- "Select Design, Marketing and DevOps"
Form Filling
```jsx
<form>
  <input
    type="text"
    name="name"
    placeholder="Full Name"
    aria-label="Full Name Input"
  />
  <input
    type="email"
    name="email"
    placeholder="Email Address"
    aria-label="Email Input"
  />
  <textarea
    name="message"
    placeholder="Your message"
    aria-label="Message Input"
  />
  <button type="submit">Submit Form</button>
</form>
```

Voice commands:
- "Set full name to John Smith"
- "Fill email with [email protected]"
- "Set message to I would like more information"
- "Click submit form"
🏗️ Architecture Overview
The SDK is the browser-side component that communicates with your backend server:
```
┌─────────────────────────────────────────────────────┐
│           Your React Application (Client)           │
│             (@atik9157/aiui-react-sdk)              │
│                                                     │
│  ✓ Discovers interactive elements automatically     │
│  ✓ Streams UI context to your server                │
│  ✓ Executes actions from server commands            │
│  ✓ Handles voice audio I/O                          │
└──────────────────┬──────────────────────────────────┘
                   │
                   │  AIUI Protocol (WebSocket)
                   │   • Real-time context updates
                   │   • Action commands
                   │   • Bidirectional audio
                   │
                   ▼
┌─────────────────────────────────────────────────────┐
│             Your AIUI Server (Backend)              │
│            (You need to implement this)             │
│                                                     │
│  ✓ Receives UI context from SDK                     │
│  ✓ Processes voice with STT (Speech-to-Text)        │
│  ✓ Uses AI to understand commands and context       │
│  ✓ Sends action commands back to SDK                │
│  ✓ Generates voice responses with TTS               │
│                                                     │
│  Works with: OpenAI, Claude, Gemini, Ollama, etc.   │
└─────────────────────────────────────────────────────┘
```

🔌 Backend Server Requirements
Your backend server needs to implement the AIUI Protocol - a WebSocket-based protocol for real-time UI control. Here's what you need:
Required WebSocket Endpoints
1. `/context` endpoint - For UI context and action commands
   - Receives UI state updates from the SDK
   - Sends action commands back to the SDK
   - Query params: `applicationId` (required), `apiKey` (optional)
2. `/audio` endpoint - For voice interaction
   - Receives microphone audio: 16 kHz PCM (`Int16Array`)
   - Sends playback audio: 24 kHz PCM (`Int16Array`)
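To make the endpoint layout concrete, the connection URLs can be built with the standard `URL` API. `buildEndpointUrl` is a hypothetical helper, and treating `/audio` as taking the same query parameters as `/context` is an assumption here:

```typescript
// Hypothetical sketch of the WebSocket URLs the SDK would open,
// based on the endpoint and query-parameter requirements above.
function buildEndpointUrl(
  serverUrl: string,
  endpoint: '/context' | '/audio',
  applicationId: string,
  apiKey?: string,
): string {
  const url = new URL(endpoint, serverUrl); // wss:// is a special scheme, so paths resolve
  url.searchParams.set('applicationId', applicationId);
  if (apiKey) url.searchParams.set('apiKey', apiKey);
  return url.toString();
}
```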
Key Message Types Your Server Will Receive
context_update - Complete UI state
```js
{
  type: 'context_update',
  page: { route: '/dashboard', title: 'Dashboard' },
  elements: [
    {
      semantic: 'Submit button',
      type: 'button',
      actions: ['click'],
      selector: 'button.submit:nth-of-type(1)'
    }
    // ... more elements
  ],
  viewport: { width: 1920, height: 1080 }
}
```

context_append - New elements added (e.g., modal opened)
```js
{
  type: 'context_append',
  elements: [/* only new elements */]
}
```

Messages Your Server Sends to SDK
action - Command SDK to perform action
```js
{
  type: 'action',
  action: 'click',
  params: { semantic: 'submit button' }
}
```

Basic Server Implementation Example
```js
const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', (ws, req) => {
  const path = new URL(req.url, 'ws://localhost').pathname;

  if (path === '/context') {
    ws.on('message', async (data) => {
      const msg = JSON.parse(data);

      if (msg.type === 'context_update') {
        // 1. The UI context (page, elements, viewport) is on the message itself
        const { page, elements } = msg;

        // 2. Process it with your AI of choice (OpenAI, Claude, etc.).
        //    `processWithAI` stands in for your own integration and should
        //    return an action command such as:
        //    { type: 'action', action: 'click', params: { semantic: 'submit button' } }
        const action = await processWithAI(page, elements);

        // 3. Send the action back to the SDK
        ws.send(JSON.stringify(action));
      }
    });
  }
});
```

For complete protocol documentation and server examples, see the AIUI Protocol Specification.
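For orientation, the message shapes above can be captured as a TypeScript discriminated union. The SDK exports its own definitions (see the TypeScript Support section); the sketch below only restates the protocol examples:

```typescript
// Sketch of the AIUI Protocol messages described above, for orientation only.
interface ElementState {
  semantic: string;
  type: string;
  actions: string[];
  selector: string;
}

type AIUIMessage =
  | { type: 'context_update';
      page: { route: string; title?: string };
      elements: ElementState[];
      viewport: { width: number; height: number } }
  | { type: 'context_append'; elements: ElementState[] }
  | { type: 'action'; action: string; params: { semantic: string } };

// Narrowing on the `type` field lets a server handle each message safely.
function describe(msg: AIUIMessage): string {
  switch (msg.type) {
    case 'context_update':
      return `full context: ${msg.elements.length} elements on ${msg.page.route}`;
    case 'context_append':
      return `append: ${msg.elements.length} new elements`;
    case 'action':
      return `action ${msg.action} on "${msg.params.semantic}"`;
  }
}
```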
🌐 Browser Compatibility
| Browser | Status | Notes |
|---------|--------|-------|
| Chrome/Edge | ✅ Full support | Recommended |
| Firefox | ✅ Full support | |
| Safari | ✅ Full support | Uses webkit prefix for AudioContext |
| Mobile browsers | ✅ Supported | Microphone permissions required |
Minimum Requirements:
- Modern browser with WebSocket support
- Web Audio API support
- Microphone access (for voice features)
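A quick runtime check against these requirements can be sketched as follows. `meetsMinimumRequirements` is a hypothetical helper, not an SDK export; in a browser it reflects real support, while under Node it simply returns false:

```typescript
// Hypothetical feature check for the minimum requirements listed above.
function meetsMinimumRequirements(): boolean {
  const g = globalThis as Record<string, any>;
  const hasWebSocket = typeof g.WebSocket === 'function';
  // Safari exposes AudioContext behind a webkit prefix
  const hasAudio = typeof g.AudioContext === 'function' ||
                   typeof g.webkitAudioContext === 'function';
  const hasMicApi = typeof g.navigator?.mediaDevices?.getUserMedia === 'function';
  return hasWebSocket && hasAudio && hasMicApi;
}
```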
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
📝 License
MIT © Md Mahabube Alahi Atik
🐛 Issues & Support
Found a bug or have a question?
- 📧 Email: [email protected]
- 🐛 GitHub Issues: Report a bug
- 📦 npm: @atik9157/aiui-react-sdk
🙏 Acknowledgments
Built with:
- React
- TypeScript
- Web Audio API
- WebSocket Protocol
Made with ❤️ by Md Mahabube Alahi Atik
Transform your React apps into voice-controllable interfaces today! 🚀
