@atik9157/aiui-react-sdk
Transform any React application into a voice-controllable interface with AI-powered context awareness. This SDK automatically discovers your UI elements and enables natural voice interactions.
✨ Features
- 🎤 Voice Control - Natural voice commands to interact with your UI
- 🧠 Smart Context Detection - Automatically discovers and tracks interactive elements
- 🔄 Real-time Updates - Efficient DOM change detection with incremental context updates
- 🎯 Semantic Actions - Interact with elements using natural language descriptions
- 🔒 Privacy-First - Built-in data filtering and redaction for sensitive information
- 📱 Multi-Select Support - Handle complex dropdown and multi-select interactions
- ⚡ Optimized Performance - Smart debouncing and efficient WebSocket communication
- 🎨 AI-Agnostic Backend - Works with OpenAI, Claude, local models, or any AI
- 📦 TypeScript Support - Full type definitions included
📦 Installation
```bash
npm install @atik9157/aiui-react-sdk
```

Required Audio Worklet Files
Important: You must add two audio worklet processor files to your public folder for voice functionality:
1. Create `public/player-processor.js`:

```js
/* Audio playback worklet */
class PlayerProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.queue = [];
    this.offset = 0;
    this.port.onmessage = e => this.queue.push(e.data);
  }

  process(_, outputs) {
    const out = outputs[0][0];
    let idx = 0;
    while (idx < out.length) {
      if (!this.queue.length) {
        out.fill(0, idx);
        break;
      }
      const buf = this.queue[0];
      const copy = Math.min(buf.length - this.offset, out.length - idx);
      out.set(buf.subarray(this.offset, this.offset + copy), idx);
      idx += copy;
      this.offset += copy;
      if (this.offset >= buf.length) {
        this.queue.shift();
        this.offset = 0;
      }
    }
    return true;
  }
}

registerProcessor('player-processor', PlayerProcessor);
```

2. Create `public/worklet-processor.js`:
```js
/* Microphone worklet - captures and downsamples to 16kHz */
class MicProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.dstRate = 16_000;
    this.frameMs = 20;
    this.srcRate = sampleRate;
    this.ratio = this.srcRate / this.dstRate;
    this.samplesPerPacket = Math.round(this.dstRate * this.frameMs / 1_000);
    this.packet = new Int16Array(this.samplesPerPacket);
    this.pIndex = 0;
    this.acc = 0;
    this.seq = 0;
  }

  process(inputs) {
    const input = inputs[0];
    if (!input || !input[0]?.length) return true;
    const ch = input[0];
    for (let i = 0; i < ch.length; i++) {
      this.acc += 1;
      if (this.acc >= this.ratio) {
        const s = Math.max(-1, Math.min(1, ch[i]));
        this.packet[this.pIndex++] = s < 0 ? s * 32768 : s * 32767;
        this.acc -= this.ratio;
        if (this.pIndex === this.packet.length) {
          this.port.postMessage(this.packet.buffer, [this.packet.buffer]);
          this.packet = new Int16Array(this.samplesPerPacket);
          this.pIndex = 0;
          this.seq++;
        }
      }
    }
    return true;
  }
}

registerProcessor("mic-processor", MicProcessor);
```

Your project structure should look like:
```
your-app/
├── public/
│   ├── player-processor.js    ← Required for audio playback
│   ├── worklet-processor.js   ← Required for microphone input
│   └── index.html
├── src/
│   └── App.tsx
└── package.json
```

⚠️ Note: These worklet files must be in the `public` folder and served at the `/player-processor.js` and `/worklet-processor.js` URLs. The SDK loads them at runtime for audio processing.
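For reference, the per-sample conversion at the heart of `worklet-processor.js` (clamp a Float32 sample to [-1, 1], then scale to signed 16-bit) can be pulled out as a plain function. This helper is illustrative only, not an SDK export:

```typescript
// Mirrors the conversion in worklet-processor.js: clamp a Web Audio
// Float32 sample to [-1, 1], then scale to a signed 16-bit integer.
// Negatives scale by 32768 and positives by 32767 so both extremes map
// onto the asymmetric Int16 range (-32768..32767) without overflow.
function floatToInt16(sample: number): number {
  const s = Math.max(-1, Math.min(1, sample));
  // Int16Array assignment truncates toward zero, so Math.trunc matches it.
  return Math.trunc(s < 0 ? s * 32768 : s * 32767);
}
```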
🚀 Quick Start
1. Wrap your app with AIUIProvider
```tsx
import { AIUIProvider } from '@atik9157/aiui-react-sdk';
import type { AIUIConfig } from '@atik9157/aiui-react-sdk';

const config: AIUIConfig = {
  applicationId: 'my-awesome-app',
  serverUrl: 'wss://your-aiui-server.com',
  apiKey: 'your-api-key', // Optional
  pages: [
    {
      route: '/',
      title: 'Home',
      safeActions: ['click', 'set_value'],
    },
    {
      route: '/dashboard',
      title: 'Dashboard',
    }
  ]
};

function App() {
  return (
    <AIUIProvider config={config}>
      <YourApp />
    </AIUIProvider>
  );
}
```

2. Add voice control button
```tsx
import { useAIUI } from '@atik9157/aiui-react-sdk';

function VoiceControlButton() {
  const { isConnected, isListening, startListening, stopListening } = useAIUI();

  return (
    <div>
      <p>Status: {isConnected ? '🟢 Connected' : '🔴 Disconnected'}</p>
      <button onClick={isListening ? stopListening : startListening}>
        {isListening ? '🎤 Listening...' : '🔇 Start Voice Control'}
      </button>
    </div>
  );
}
```

3. That's it! Your app is now voice-controllable 🎉
Users can now say things like:
- "Click the submit button"
- "Fill in the email field with [email protected]"
- "Navigate to the dashboard page"
- "Select Marketing and Sales from the categories dropdown"
📚 Configuration
AIUIConfig
| Property | Type | Required | Description |
|----------|------|----------|-------------|
| applicationId | string | ✅ | Unique identifier for your application |
| serverUrl | string | ✅ | WebSocket URL of your AIUI server |
| apiKey | string | ❌ | Authentication key for your server |
| pages | MinimalPageConfig[] | ✅ | Array of page configurations |
| safetyRules | SafetyRules | ❌ | Security and safety configurations |
| privacy | PrivacyConfig | ❌ | Privacy and data filtering rules |
Page Configuration
```ts
interface MinimalPageConfig {
  route: string;               // Page route (e.g., '/dashboard')
  title?: string;              // Page title for context
  safeActions?: string[];      // Allowed actions on this page
  dangerousActions?: string[]; // Actions requiring confirmation
}
```

Safety Rules
Protect your users by restricting dangerous actions and sensitive areas:
```ts
safetyRules: {
  requireConfirmation: ['delete', 'submit_payment', 'purchase'],
  blockedSelectors: ['.admin-only', '[data-sensitive]'],
  allowedDomains: ['yourapp.com', 'api.yourapp.com']
}
```

Privacy Configuration
Automatically redact sensitive information from context:
```ts
privacy: {
  exposePasswords: false,
  exposeCreditCards: false,
  redactPatterns: ['ssn', 'social-security', 'credit-card']
}
```

🎯 Supported Actions
The SDK automatically detects and enables these actions on interactive elements:
| Action | Description | Example Voice Command |
|--------|-------------|----------------------|
| click | Click buttons, links, and interactive elements | "Click the submit button" |
| set_value | Set input/textarea values | "Set email to [email protected]" |
| select_from_dropdown | Select options from dropdowns | "Select Design and Marketing" |
| toggle | Toggle checkboxes | "Toggle the remember me checkbox" |
| navigate | Navigate between routes | "Navigate to dashboard" |
🔧 Advanced Usage
Programmatic Action Execution
Execute actions programmatically without voice:
```tsx
import { useAIUI } from '@atik9157/aiui-react-sdk';

function MyComponent() {
  const { executeAction } = useAIUI();

  const handleCustomAction = async () => {
    try {
      const result = await executeAction('click', {
        semantic: 'submit button'
      });
      if (result.success) {
        console.log('Action executed successfully!');
      }
    } catch (error) {
      console.error('Action failed:', error);
    }
  };

  return <button onClick={handleCustomAction}>Execute Action</button>;
}
```

Index-Aware Element Selection
When multiple identical elements exist (e.g., multiple "Delete" buttons), use index notation:
```ts
// Using number notation
await executeAction('click', { semantic: 'Delete button #3' });

// Using ordinal words
await executeAction('click', { semantic: 'second delete button' });
```

Voice commands work the same way:
- "Click the third delete button"
- "Click delete button number 2"
Enhanced Multi-Select Dropdowns
For React Select or custom dropdowns, add semantic attributes to enable voice selection:
```jsx
<input
  data-select-field="project-categories"
  data-select-options="Design|||Development|||Marketing|||Sales"
  placeholder="Select categories"
/>
```

Then users can say:
- "Select Design and Marketing from project categories"
- "Choose Development and Sales"
Getting Component Values
Read values from discovered elements:
```ts
const { getComponentValue } = useAIUI();

const emailValue = getComponentValue('#email-input');
console.log('Email:', emailValue);
```

Monitoring Context Changes
Track when the UI context changes:
```tsx
import { useEffect } from 'react';

const { currentPage } = useAIUI();

useEffect(() => {
  console.log('Current page:', currentPage);
}, [currentPage]);
```

🎨 TypeScript Support
Full TypeScript definitions are included:
```ts
import type {
  AIUIConfig,
  MinimalPageConfig,
  SafetyRules,
  PrivacyConfig,
  ActionType,
  ActionParams,
  ActionResult,
  ContextUpdate,
  DiscoveredElementState
} from '@atik9157/aiui-react-sdk';
```

🖥️ Development Mode
The SDK includes a development overlay showing real-time connection status:
```ts
// Automatically shown when NODE_ENV === 'development'
// Displays:
// - AIUI connection status (🟢/🔴)
// - Microphone status (🎤/🔇)
// - Current page route
```

The overlay appears in the bottom-right corner and helps you debug connectivity issues during development.
📖 Examples
Basic E-commerce Store
```tsx
import { AIUIProvider, useAIUI } from '@atik9157/aiui-react-sdk';

const config = {
  applicationId: 'ecommerce-store',
  serverUrl: 'wss://aiui.mystore.com',
  pages: [
    { route: '/', title: 'Home' },
    { route: '/products', title: 'Products' },
    { route: '/cart', title: 'Shopping Cart' },
    { route: '/checkout', title: 'Checkout' }
  ],
  safetyRules: {
    requireConfirmation: ['place_order', 'delete_account'],
    blockedSelectors: ['.admin-panel']
  },
  privacy: {
    exposePasswords: false,
    exposeCreditCards: false
  }
};

function App() {
  return (
    <AIUIProvider config={config}>
      <Store />
    </AIUIProvider>
  );
}
```

Voice commands your users can use:
- "Add the blue shirt to cart"
- "Navigate to checkout"
- "Fill shipping address with 123 Main St"
- "Select express shipping"
Dashboard with Filters
<div className="filters">
<label>Project Categories</label>
<input
data-select-field="categories"
data-select-options="Frontend|||Backend|||DevOps|||Design|||Marketing"
placeholder="Select categories..."
/>
<label>Status</label>
<input
data-select-field="status"
data-select-options="Active|||Pending|||Completed|||Archived"
placeholder="Select status..."
/>
</div>Voice commands:
- "Select Frontend and Backend from categories"
- "Choose Active from status"
- "Select Design, Marketing and DevOps"
Form Filling
```jsx
<form>
  <input
    type="text"
    name="name"
    placeholder="Full Name"
    aria-label="Full Name Input"
  />
  <input
    type="email"
    name="email"
    placeholder="Email Address"
    aria-label="Email Input"
  />
  <textarea
    name="message"
    placeholder="Your message"
    aria-label="Message Input"
  />
  <button type="submit">Submit Form</button>
</form>
```

Voice commands:
- "Set full name to John Smith"
- "Fill email with [email protected]"
- "Set message to I would like more information"
- "Click submit form"
🏗️ Architecture Overview
The SDK is the browser-side component that communicates with your backend server:
```
┌─────────────────────────────────────────────────────┐
│           Your React Application (Client)           │
│             (@atik9157/aiui-react-sdk)              │
│                                                     │
│  ✓ Discovers interactive elements automatically     │
│  ✓ Streams UI context to your server                │
│  ✓ Executes actions from server commands            │
│  ✓ Handles voice audio I/O                          │
└──────────────────┬──────────────────────────────────┘
                   │
                   │  AIUI Protocol (WebSocket)
                   │   • Real-time context updates
                   │   • Action commands
                   │   • Bidirectional audio
                   │
                   ▼
┌─────────────────────────────────────────────────────┐
│             Your AIUI Server (Backend)              │
│            (You need to implement this)             │
│                                                     │
│  ✓ Receives UI context from SDK                     │
│  ✓ Processes voice with STT (Speech-to-Text)        │
│  ✓ Uses AI to understand commands and context       │
│  ✓ Sends action commands back to SDK                │
│  ✓ Generates voice responses with TTS               │
│                                                     │
│  Works with: OpenAI, Claude, Gemini, Ollama, etc.   │
└─────────────────────────────────────────────────────┘
```

🔌 Backend Server Requirements
Your backend server needs to implement the AIUI Protocol - a WebSocket-based protocol for real-time UI control. Here's what you need:
Required WebSocket Endpoints
1. `/context` endpoint - For UI context and action commands
   - Receives UI state updates from the SDK
   - Sends action commands back to the SDK
   - Query params: `applicationId` (required), `apiKey` (optional)
2. `/audio` endpoint - For voice interaction
   - Receives microphone audio: 16 kHz PCM (`Int16Array`)
   - Sends playback audio: 24 kHz PCM (`Int16Array`)
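To make the endpoint layout concrete, the connection URLs can be built with the standard `URL` API. `buildEndpointUrl` is a hypothetical helper, and treating `/audio` as taking the same query parameters as `/context` is an assumption here:

```typescript
// Hypothetical sketch of the WebSocket URLs the SDK would open,
// based on the endpoint and query-parameter requirements above.
function buildEndpointUrl(
  serverUrl: string,
  endpoint: '/context' | '/audio',
  applicationId: string,
  apiKey?: string,
): string {
  const url = new URL(endpoint, serverUrl); // wss:// is a special scheme, so paths resolve
  url.searchParams.set('applicationId', applicationId);
  if (apiKey) url.searchParams.set('apiKey', apiKey);
  return url.toString();
}
```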
Key Message Types Your Server Will Receive
context_update - Complete UI state
```js
{
  type: 'context_update',
  page: { route: '/dashboard', title: 'Dashboard' },
  elements: [
    {
      semantic: 'Submit button',
      type: 'button',
      actions: ['click'],
      selector: 'button.submit:nth-of-type(1)'
    }
    // ... more elements
  ],
  viewport: { width: 1920, height: 1080 }
}
```

context_append - New elements added (e.g., modal opened)
```js
{
  type: 'context_append',
  elements: [/* only new elements */]
}
```

Messages Your Server Sends to SDK
action - Command SDK to perform action
```js
{
  type: 'action',
  action: 'click',
  params: { semantic: 'submit button' }
}
```

Basic Server Implementation Example
```js
const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', (ws, req) => {
  const path = new URL(req.url, 'ws://localhost').pathname;

  if (path === '/context') {
    ws.on('message', async (data) => {
      const msg = JSON.parse(data);

      if (msg.type === 'context_update') {
        // 1. The UI context (page, elements, viewport) is on the message itself
        const { page, elements } = msg;

        // 2. Process it with your AI of choice (OpenAI, Claude, etc.).
        //    `processWithAI` stands in for your own integration and should
        //    return an action command such as:
        //    { type: 'action', action: 'click', params: { semantic: 'submit button' } }
        const action = await processWithAI(page, elements);

        // 3. Send the action back to the SDK
        ws.send(JSON.stringify(action));
      }
    });
  }
});
```

For complete protocol documentation and server examples, see the AIUI Protocol Specification.
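For orientation, the message shapes above can be captured as a TypeScript discriminated union. The SDK exports its own definitions (see the TypeScript Support section); the sketch below only restates the protocol examples:

```typescript
// Sketch of the AIUI Protocol messages described above, for orientation only.
interface ElementState {
  semantic: string;
  type: string;
  actions: string[];
  selector: string;
}

type AIUIMessage =
  | { type: 'context_update';
      page: { route: string; title?: string };
      elements: ElementState[];
      viewport: { width: number; height: number } }
  | { type: 'context_append'; elements: ElementState[] }
  | { type: 'action'; action: string; params: { semantic: string } };

// Narrowing on the `type` field lets a server handle each message safely.
function describe(msg: AIUIMessage): string {
  switch (msg.type) {
    case 'context_update':
      return `full context: ${msg.elements.length} elements on ${msg.page.route}`;
    case 'context_append':
      return `append: ${msg.elements.length} new elements`;
    case 'action':
      return `action ${msg.action} on "${msg.params.semantic}"`;
  }
}
```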
🌐 Browser Compatibility
| Browser | Status | Notes |
|---------|--------|-------|
| Chrome/Edge | ✅ Full support | Recommended |
| Firefox | ✅ Full support | |
| Safari | ✅ Full support | Uses webkit prefix for AudioContext |
| Mobile browsers | ✅ Supported | Microphone permissions required |
Minimum Requirements:
- Modern browser with WebSocket support
- Web Audio API support
- Microphone access (for voice features)
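A quick runtime check against these requirements can be sketched as follows. `meetsMinimumRequirements` is a hypothetical helper, not an SDK export; in a browser it reflects real support, while under Node it simply returns false:

```typescript
// Hypothetical feature check for the minimum requirements listed above.
function meetsMinimumRequirements(): boolean {
  const g = globalThis as Record<string, any>;
  const hasWebSocket = typeof g.WebSocket === 'function';
  // Safari exposes AudioContext behind a webkit prefix
  const hasAudio = typeof g.AudioContext === 'function' ||
                   typeof g.webkitAudioContext === 'function';
  const hasMicApi = typeof g.navigator?.mediaDevices?.getUserMedia === 'function';
  return hasWebSocket && hasAudio && hasMicApi;
}
```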
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
📝 License
MIT © Md Mahabube Alahi Atik
🐛 Issues & Support
Found a bug or have a question?
- 📧 Email: [email protected]
- 🐛 GitHub Issues: Report a bug
- 📦 npm: @atik9157/aiui-react-sdk
🙏 Acknowledgments
Built with:
- React
- TypeScript
- Web Audio API
- WebSocket Protocol
Made with ❤️ by Md Mahabube Alahi Atik
Transform your React apps into voice-controllable interfaces today! 🚀
