Talking Avatar Plugin
A universal JavaScript plugin that adds a 3D talking avatar to any chat application. The avatar listens to your voice, transcribes speech to text, injects it into the chat, and speaks responses with real-time lip-sync.
No API keys required - everything runs in-browser using WebGPU.
All dependencies bundled - just install and use, no additional setup needed.
Features
- Voice Input: Click to speak - your voice is transcribed and sent to the chat
- Voice Output: Chat responses are spoken aloud with lip-synced 3D avatar
- Universal: Works with any chat interface (ChatGPT, Claude, custom apps)
- Zero Config: Auto-detects chat input fields and message containers
- No API Keys: Uses in-browser TTS (HeadTTS/Kokoro) and Web Speech API
- All Dependencies Bundled: Three.js, TalkingHead, and HeadTTS are included - no additional installs
- React/Vue/Angular Compatible: Works with controlled components, includes event callbacks
- Programmatic API: Full control with speak(), startListening(), isSpeaking(), etc.
- SSR/Next.js Ready: Includes browser guards for server-side rendering
- Customizable: Bring your own 3D avatar, adjust voice, mood, and behavior
- Smart Audio: Automatically pauses listening while avatar speaks (prevents feedback loop)
- Selective Detection: Only speaks bot/assistant messages, ignores user messages
Browser Requirements
| Browser | Support |
|---------|---------|
| Chrome Desktop | ✅ Full support (WebGPU) |
| Edge Desktop | ✅ Full support (WebGPU) |
| Firefox | ⚠️ Limited (WASM fallback, slower) |
| Safari | ⚠️ Limited (WASM fallback, slower) |
| Mobile | ❌ Not supported (WebGPU required) |
Installation
Option 1: npm (Recommended)
npm install talking-avatar-plugin

import { TalkingAvatarPlugin } from 'talking-avatar-plugin';
TalkingAvatarPlugin.init();

Option 2: CDN (Script Tag)
<script src="https://unpkg.com/talking-avatar-plugin/dist/talking-avatar.min.js"></script>
<script>
TalkingAvatarPlugin.init();
</script>

Or using ES modules:
<script type="module">
import { TalkingAvatarPlugin } from 'https://unpkg.com/talking-avatar-plugin/dist/talking-avatar.esm.js';
TalkingAvatarPlugin.init();
</script>

Option 3: Download
Download dist/talking-avatar.min.js and include it in your project:
<script src="talking-avatar.min.js"></script>
<script>
TalkingAvatarPlugin.init();
</script>

Quick Start
Basic Usage
import { TalkingAvatarPlugin } from 'talking-avatar-plugin';
// Initialize with default settings
TalkingAvatarPlugin.init();

That's it! The avatar will:
- Appear in the bottom-right corner
- Auto-detect your chat interface
- Listen when you click the mic button
- Speak chat responses with lip-sync
Conversation Flow
┌─────────────────────────────────────────────────────────────────┐
│ 1. USER CLICKS MIC │
│ └──▶ Speech recognition starts │
│ │
│ 2. USER SPEAKS │
│ └──▶ Voice transcribed to text │
│ └──▶ Text injected into chat input │
│ └──▶ Message auto-submitted │
│ │
│ 3. CHAT RESPONDS │
│ └──▶ Plugin detects new assistant message │
│ └──▶ Mic PAUSES (prevents avatar voice being recorded) │
│ └──▶ Avatar speaks response with lip-sync │
│ │
│ 4. AVATAR FINISHES SPEAKING │
│ └──▶ User can click mic to continue conversation │
└─────────────────────────────────────────────────────────────────┘

Note: The microphone automatically pauses while the avatar is speaking to prevent audio feedback (the avatar's voice being picked up and sent as a new message).
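Every step of this flow is observable through the event callbacks documented under Event Callbacks below. A minimal sketch that logs the full round trip:

TalkingAvatarPlugin.init({
  onListeningStart: () => console.log('1. Mic on'),
  onTranscript: (text) => console.log('2. Heard:', text),
  onAfterSubmit: (text) => console.log('2. Submitted:', text),
  onNewMessage: (message) => console.log('3. Assistant replied:', message.text),
  onSpeakStart: (text) => console.log('3. Avatar speaking (mic paused)'),
  onSpeakEnd: () => console.log('4. Done speaking - click mic to continue'),
});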
With Configuration
TalkingAvatarPlugin.init({
// Avatar appearance
avatarUrl: './my-avatar.glb', // Custom 3D avatar
avatarMood: 'happy', // Mood: neutral, happy, sad, angry
// Voice settings
ttsVoice: 'af_bella', // Kokoro voice (see Voice Options)
// Chat detection (override auto-detection)
inputSelector: '#my-chat-input', // CSS selector for input field
messageSelector: '.chat-messages', // CSS selector for message container
submitSelector: '#send-button', // CSS selector for submit button
// Behavior
autoSubmit: true, // Auto-submit after speaking
speakResponses: true, // Speak assistant responses
position: 'bottom-right', // Avatar position
size: 300, // Avatar size in pixels
// Language
lang: 'en-US', // Speech recognition language
});

Integration Examples
ChatGPT Web Interface
// ChatGPT auto-detection works out of the box
TalkingAvatarPlugin.init({
speakResponses: true,
autoSubmit: true,
});

Claude Web Interface
TalkingAvatarPlugin.init({
inputSelector: 'div[contenteditable="true"]',
messageSelector: '[class*="conversation"]',
});

Custom Chat Application
TalkingAvatarPlugin.init({
// Point to your chat elements
inputSelector: '#chat-input',
messageSelector: '#messages-container',
submitSelector: '#send-btn',
// Configure behavior
autoSubmit: true,
speakResponses: true,
});

React Integration
import { useEffect } from 'react';
import { TalkingAvatarPlugin } from 'talking-avatar-plugin';
function ChatApp() {
useEffect(() => {
TalkingAvatarPlugin.init({
inputSelector: '#chat-input',
messageSelector: '#messages',
});
return () => {
TalkingAvatarPlugin.destroy();
};
}, []);
return (
<div>
<div id="messages">{/* Messages here */}</div>
<input id="chat-input" type="text" />
</div>
);
}

Vue Integration
<template>
<div>
<div id="messages"><!-- Messages --></div>
<input id="chat-input" type="text" />
</div>
</template>
<script setup>
import { onMounted, onUnmounted } from 'vue';
import { TalkingAvatarPlugin } from 'talking-avatar-plugin';
onMounted(() => {
TalkingAvatarPlugin.init({
inputSelector: '#chat-input',
messageSelector: '#messages',
});
});
onUnmounted(() => {
TalkingAvatarPlugin.destroy();
});
</script>

Webpack / Vite / Bundler Integration
All dependencies are bundled, so it works out of the box with any bundler:
npm install talking-avatar-plugin

import { TalkingAvatarPlugin } from 'talking-avatar-plugin';
TalkingAvatarPlugin.init({
inputSelector: '#chat-input',
messageSelector: '#messages',
});

No additional configuration needed.
API Reference
TalkingAvatarPlugin.init(options)
Initialize the plugin with optional configuration.
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| avatarUrl | string | null | URL to custom GLB avatar file |
| avatarMood | string | 'neutral' | Avatar mood: neutral, happy, sad, angry, fear, disgust, love, sleep |
| ttsVoice | string | 'af_heart' | Kokoro TTS voice (see Voice Options) |
| inputSelector | string | null | CSS selector for chat input (auto-detect if null) |
| messageSelector | string | null | CSS selector for message container (auto-detect if null) |
| submitSelector | string | null | CSS selector for submit button (auto-detect if null) |
| autoSubmit | boolean | true | Automatically submit after speech recognition |
| speakResponses | boolean | true | Speak assistant/bot responses |
| position | string | 'bottom-right' | Avatar position: bottom-right, bottom-left, top-right, top-left |
| size | number | 300 | Avatar container size in pixels |
| lang | string | 'en-US' | Speech recognition language |
| lipsyncLang | string | 'en' | Lip-sync language module |
| mode | string | 'full' | 'full' (auto-detect & inject) or 'callback-only' (no DOM changes) |
Event Callbacks
The plugin supports comprehensive event callbacks for framework integration:
TalkingAvatarPlugin.init({
// Speech Recognition Events
onListeningStart: () => console.log('Mic activated'),
onListeningEnd: () => console.log('Mic deactivated'),
onTranscriptInterim: (text) => console.log('Partial:', text),
onTranscript: (text) => {
console.log('Final transcript:', text);
// For React: update your state here
// setMessage(text);
},
// Avatar Events
onReady: () => console.log('Avatar ready'),
onSpeakStart: (text) => console.log('Speaking:', text),
onSpeakEnd: () => console.log('Done speaking'),
// Submission Events
onBeforeSubmit: (text) => {
// Return false to prevent auto-submission
return text.length > 0;
},
onAfterSubmit: (text) => console.log('Submitted:', text),
// Message Detection
onNewMessage: (message) => console.log('New message:', message.text),
// Error Handling
onError: (error, context) => console.error(context, error),
});

Callback-Only Mode (for React/Vue/Angular)
If you want full control and don't want the plugin to touch the DOM:
TalkingAvatarPlugin.init({
mode: 'callback-only',
onTranscript: (text) => {
// Handle the transcript yourself
sendMessageToChat(text);
},
onNewMessage: (message) => {
// Handle new messages yourself
TalkingAvatarPlugin.speak(message.text);
}
});

Programmatic API
TalkingAvatarPlugin.speak(text)
Make the avatar speak text programmatically.
TalkingAvatarPlugin.speak("Hello! How can I help you today?");TalkingAvatarPlugin.stopSpeaking()
Stop the avatar from speaking.
TalkingAvatarPlugin.stopSpeaking();

TalkingAvatarPlugin.startListening() / TalkingAvatarPlugin.stopListening()
Control the microphone programmatically.
TalkingAvatarPlugin.startListening();
// ... later
TalkingAvatarPlugin.stopListening();

TalkingAvatarPlugin.setMood(mood)
Change the avatar's mood/expression.
TalkingAvatarPlugin.setMood('happy');
// Options: neutral, happy, sad, angry, fear, disgust, love, sleep

TalkingAvatarPlugin.stop()
Stop both speaking and listening.
TalkingAvatarPlugin.stop();

TalkingAvatarPlugin.show() / TalkingAvatarPlugin.hide()
Show or hide the avatar UI.
TalkingAvatarPlugin.hide();
// ... later
TalkingAvatarPlugin.show();

State Queries
TalkingAvatarPlugin.isSpeaking(); // true if avatar is speaking
TalkingAvatarPlugin.isListening(); // true if mic is active
TalkingAvatarPlugin.isReady(); // true if fully initialized
TalkingAvatarPlugin.getMood(); // current mood string

TalkingAvatarPlugin.destroy()
Remove the plugin and clean up resources.
TalkingAvatarPlugin.destroy();

React Integration Example
import { useEffect, useState } from 'react';
import { TalkingAvatarPlugin } from 'talking-avatar-plugin';
function ChatApp() {
const [message, setMessage] = useState('');
const [latestResponse, setLatestResponse] = useState(''); // set this when your chat receives a reply
useEffect(() => {
TalkingAvatarPlugin.init({
mode: 'callback-only',
onTranscript: (text) => {
setMessage(text);
// Or send directly: sendMessage(text);
},
onReady: () => console.log('Avatar ready'),
});
return () => TalkingAvatarPlugin.destroy();
}, []);
// Speak new bot responses
useEffect(() => {
if (latestResponse) {
TalkingAvatarPlugin.speak(latestResponse);
}
}, [latestResponse]);
return (
<div>
<input
value={message}
onChange={(e) => setMessage(e.target.value)}
/>
<button onClick={() => sendMessage(message)}>Send</button>
</div>
);
}

Next.js / SSR Usage
The plugin includes SSR guards and works with Next.js. Import it dynamically on the client:
import { useEffect } from 'react';
export default function ChatPage() {
useEffect(() => {
// Import only on client side
import('talking-avatar-plugin').then(({ TalkingAvatarPlugin }) => {
TalkingAvatarPlugin.init({
onTranscript: (text) => console.log(text),
});
});
return () => {
import('talking-avatar-plugin').then(({ TalkingAvatarPlugin }) => {
TalkingAvatarPlugin.destroy();
});
};
}, []);
return <div>Chat App</div>;
}

Voice Options
Available Kokoro TTS voices:
| Voice ID | Description |
|----------|-------------|
| af_heart | American Female (default) |
| af_bella | American Female - Bella |
| af_nicole | American Female - Nicole |
| af_sarah | American Female - Sarah |
| af_sky | American Female - Sky |
| am_adam | American Male - Adam |
| am_michael | American Male - Michael |
| bf_emma | British Female - Emma |
| bf_isabella | British Female - Isabella |
| bm_george | British Male - George |
| bm_lewis | British Male - Lewis |
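For example, to have the avatar speak with a British female voice (any voice ID from the table above works the same way):

TalkingAvatarPlugin.init({
  ttsVoice: 'bf_emma',
});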
Custom Avatars
The plugin supports Ready Player Me avatars and custom GLB models.
Using Ready Player Me
- Create an avatar at Ready Player Me
- Get the avatar URL with morph targets:
const avatarUrl = 'https://models.readyplayer.me/YOUR_AVATAR_ID.glb?morphTargets=ARKit,Oculus+Visemes,mouthOpen,mouthSmile,eyesClosed,eyesLookUp,eyesLookDown&textureSizeLimit=1024&textureFormat=png';
TalkingAvatarPlugin.init({
avatarUrl: avatarUrl,
});

Custom GLB Requirements
Your custom avatar must include:
- Mixamo-compatible skeleton (bone structure)
- ARKit blend shapes (52 facial expressions)
- Oculus viseme blend shapes (15 lip-sync shapes)
See TalkingHead documentation for detailed requirements.
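If you are unsure whether a model qualifies, one way to sanity-check it is to load the GLB with three.js and list its morph targets. A standalone sketch (npm install three), assuming Ready Player Me-style viseme names such as viseme_aa; ./my-avatar.glb is a placeholder path:

import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js';

new GLTFLoader().load('./my-avatar.glb', (gltf) => {
  gltf.scene.traverse((node) => {
    if (node.isMesh && node.morphTargetDictionary) {
      const shapes = Object.keys(node.morphTargetDictionary);
      console.log(node.name, shapes.length, 'morph targets');
      // Lip-sync needs the Oculus viseme shapes (viseme_aa, viseme_PP, ...)
      console.log('Visemes present:', shapes.filter((s) => s.startsWith('viseme_')));
    }
  });
});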
How It Works
┌─────────────────────────────────────────────────────────────────┐
│ Your Chat App │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ Talking Avatar Plugin ┌──────────────┐ │
│ │ │ │ │ │
│ │ Chat Input │◄─── Injects text ◄────┐ │ Messages │ │
│ │ │ │ │ Container │ │
│ └──────────────┘ │ └──────┬───────┘ │
│ │ │ │
└───────────────────────────────────────────┼────────────┼─────────┘
│ │
│ │ Observes
│ ▼
┌───────────────────────────────────────────┴────────────┴─────────┐
│ Talking Avatar Plugin │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐│
│ │ Speech │ │ Chat │ │ Avatar Manager ││
│ │ Recognition │──▶│ Detector │──▶│ (TalkingHead + HeadTTS)││
│ │ (Web Speech │ │ (DOM Watch) │ │ ││
│ │ API) │ │ │ │ ┌─────────┐ ┌────────┐ ││
│ └─────────────┘ └─────────────┘ │ │ 3D │ │ Voice │ ││
│ │ │ Avatar │ │ Output │ ││
│ User speaks ──▶ Text ──▶ Chat │ └─────────┘ └────────┘ ││
│ Chat responds ──▶ Avatar speaks with lip-sync ││
│ │
└──────────────────────────────────────────────────────────────────┘

Troubleshooting
Avatar won't load
- Check browser: Use Chrome or Edge desktop (WebGPU required)
- Check console: Look for error messages
- First load is slow: HeadTTS downloads the ~80MB Kokoro model on first use
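To warn users up front rather than let them wait on the slower fallback, you can feature-check WebGPU before initializing (a sketch; navigator.gpu is the standard WebGPU entry point):

// Detect WebGPU support before starting the plugin
if (!navigator.gpu) {
  console.warn('No WebGPU detected - expect the slower WASM fallback');
}
TalkingAvatarPlugin.init();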
Speech recognition not working
- Allow microphone: Check browser permissions
- HTTPS required: Speech recognition requires secure context
- Language support: Web Speech API supports limited languages
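A quick diagnostic for the first two items (a sketch using standard browser APIs; the 'microphone' permission query is not supported in every browser):

// Check the two common blockers: insecure context and denied mic access
console.log('Secure context:', window.isSecureContext);
navigator.permissions
  ?.query({ name: 'microphone' })
  .then((status) => console.log('Mic permission:', status.state))
  .catch(() => console.log('Permission query not supported here'));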
Chat not detected
- Provide selectors: Use inputSelector, messageSelector, submitSelector
- Check timing: The plugin waits for the DOM to be ready
- Console logs: Look for "Chat detector" messages
Avatar not speaking responses
- Check messageSelector: Make sure it points to the message container
- Response detection: Plugin looks for .assistant, .bot-message, etc.
- Console logs: Look for "Detected assistant message" logs
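To see exactly what the plugin detects, pass an explicit messageSelector and log new messages through the documented onNewMessage callback:

TalkingAvatarPlugin.init({
  messageSelector: '#messages-container', // replace with your app's container
  onNewMessage: (message) => console.log('Detected message:', message.text),
});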
Avatar speaking user messages (feedback loop)
- This should not happen: The plugin automatically filters user messages
- Check CSS classes: User messages should have a .user class or similar
- Check detection: Console shows what messages are detected
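If you drive the microphone yourself through the programmatic API, you can apply the same guard manually using the documented state query:

// Only re-enable the mic once the avatar has finished speaking
if (!TalkingAvatarPlugin.isSpeaking()) {
  TalkingAvatarPlugin.startListening();
}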
Project Structure
talking-avatar-plugin/
├── src/
│ ├── plugin.js # Main entry point & initialization
│ ├── speech-recognition.js # Web Speech API wrapper
│ ├── chat-detector.js # Universal chat element detector
│ ├── avatar-manager.js # TalkingHead + HeadTTS integration
│ └── ui.js # Floating avatar UI component
├── dist/
│ ├── talking-avatar.min.js # IIFE bundle for <script> tag
│ ├── talking-avatar.esm.js # ESM bundle for imports
│ └── talking-avatar.d.ts # TypeScript declarations
├── demo/
│ └── index.html # Demo page with mock chat
├── package.json
├── README.md
└── LICENSE

Dependencies
The plugin bundles the following dependencies, so nothing extra is installed or fetched from a CDN:
- TalkingHead - 3D avatar with lip-sync
- HeadTTS - In-browser TTS with Kokoro
- Three.js - 3D rendering
Performance Notes
- First load: ~10-15 seconds (downloads Kokoro model ~80MB)
- Subsequent loads: ~2-3 seconds (model cached in browser)
- Memory usage: ~200-400MB (neural TTS model)
- CPU/GPU: TTS inference runs on the GPU via WebGPU
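Since the TTS model keeps a few hundred MB resident, it can be worth releasing it when the avatar is no longer needed. A sketch using the documented destroy():

// Free the avatar and its model when the page is being left
window.addEventListener('pagehide', () => {
  TalkingAvatarPlugin.destroy();
});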
Development
Setup
# Clone the repository
git clone https://github.com/your-repo/talking-avatar-plugin.git
cd talking-avatar-plugin
# Install dependencies
npm install
# Start development server
npm run serve
# Open http://localhost:8080/demo/

Build
# Build all bundles (ESM + IIFE)
npm run build
# Build development bundle with sourcemaps
npm run build:dev
# Watch for changes
npm run watch

Publish to npm
npm login
npm publish

License
MIT License - see LICENSE file.
Credits
- TalkingHead by met4citizen
- HeadTTS by met4citizen
- Kokoro TTS - Neural TTS model
- Three.js - 3D library
Contributing
Contributions welcome! Please read the contributing guidelines first.
Support
- GitHub Issues - Bug reports and feature requests
- Discussions - Questions and community
