narrator-avatar

v1.1.5

Published

a month ago

React component for 3D talking avatars with lip-sync, Deepgram/Google TTS, content-aware gestures, and pause/resume


sage-rsc

react avatar talking narrator-avatar lip-sync speech tts deepgram voice 3d

narrator-avatar

React component for 3D talking avatars with lip-sync, Deepgram or Google TTS, content-aware hand gestures, and pause/resume. Built on @met4citizen/talkinghead.

Features

3D avatars – Ready Player Me–compatible GLB models (full-body)
Lip-sync – Word-level sync with Deepgram or Google TTS (English bundled; other languages need extra setup)
TTS – Deepgram (streaming) or Google Cloud Text-to-Speech
Gestures – Content-aware hand gestures: handup, index, ok, thumbup, side, shrug
Playback – Speak, Pause, Resume, Stop (phrase-level in accurate mode)
Avatar Studio controls – Visual presets, skin gloss/pores/warmth, eye contact, and mood
Accessibility – Subtitle callback for closed captions

Install

npm install narrator-avatar

Peer dependencies (React 18+) are installed automatically by npm 7 and later. The package bundles TalkingHead, Three.js, and English lip-sync, so no import maps or Vite config are required.

Usage

import { useRef } from 'react';
import NarratorAvatar from 'narrator-avatar';

function MyPage() {
  const avatarRef = useRef(null);

  return (
    <div style={{ width: '400px', height: '500px' }}>
      <NarratorAvatar
        ref={avatarRef}
        avatarUrl="/avatars/brunette.glb"
        avatarBody="F"
        ttsService="deepgram"
        ttsVoice="aura-2-aurora-en"
        ttsApiKey={import.meta.env.VITE_DEEPGRAM_API_KEY}
        accurateLipSync={true}
        speechRate={0.9}
        visualPreset="beauty"
        skinGloss={1.5}
        skinPores={2}
        onReady={() => {}}
        onSpeechStart={() => {}}
        onSpeechEnd={() => {}}
        onSubtitle={() => {}}
      />
      <button onClick={() => avatarRef.current?.speakText('Hello! How are you?')}>
        Speak
      </button>
      <button onClick={() => avatarRef.current?.pauseSpeaking()}>Pause</button>
      <button onClick={() => avatarRef.current?.resumeSpeaking()}>Resume</button>
      <button onClick={() => avatarRef.current?.stopSpeaking()}>Stop</button>
      <button onClick={() => avatarRef.current?.makeEyeContact(2000)}>
        Eye contact
      </button>
    </div>
  );
}
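The empty callbacks in the example above are placeholders. As a minimal sketch of wiring up closed captions, the helper below keeps the last few subtitle lines for display. `createCaptionBuffer` is a hypothetical name, and the sketch assumes `onSubtitle` passes the subtitle text as its first argument, which this README does not document.

```javascript
// Hypothetical helper: keeps the most recent caption lines for display.
// Assumption: onSubtitle receives the subtitle text as a string.
function createCaptionBuffer(maxLines = 2) {
  const lines = [];
  return {
    // Pass this function as the onSubtitle prop.
    push(text) {
      if (typeof text !== 'string' || text.trim() === '') return;
      lines.push(text.trim());
      // Drop the oldest line once the buffer is full.
      if (lines.length > maxLines) lines.shift();
    },
    // Call from onSpeechEnd (or after stopSpeaking) to clear the captions.
    clear() {
      lines.length = 0;
    },
    // Current caption text, one line per entry.
    text() {
      return lines.join('\n');
    },
  };
}
```

In a component, `push` could be passed as `onSubtitle` and `clear` as `onSpeechEnd`, with `text()` copied into React state for rendering.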

Props

| Prop | Description |
|------|-------------|
| avatarUrl | URL to GLB model (e.g. /avatars/brunette.glb) |
| avatarBody | 'M' or 'F' for posture |
| cameraView | Camera framing: 'full', 'mid', 'upper', 'head' (default 'mid') |
| cameraRotateEnable | Allow mouse drag to rotate view (default false). Set true to enable. |
| cameraZoomEnable | Allow mouse wheel to zoom (default false). Set true to enable. |
| cameraPanEnable | Allow mouse to pan (default false) |
| ttsService | 'google' or 'deepgram' |
| ttsVoice | Deepgram: e.g. aura-2-mars-en, aura-2-aurora-en. Google: e.g. en-GB-Standard-A |
| ttsApiKey | API key (or set VITE_DEEPGRAM_API_KEY / VITE_GOOGLE_TTS_API_KEY) |
| lipsyncModules | Array of language codes (default ['en']) |
| lipsyncLang | Lip-sync language (default 'en') |
| visualPreset | Lighting preset: 'cinematic', 'beauty', 'studio', 'sunset', 'broadcast' |
| visualQuality | 'auto' (default): ultra on desktop, balanced on mobile/slow network. Also accepts 'ultra', 'balanced', or 'performance' |
| lazyMount | true, false, or 'auto' (default): defer WebGL init until visible on mobile |
| skinGloss | Skin sheen/sweat intensity from 0 to 2 |
| skinPores | Procedural pore/normal intensity from 0 to 2 |
| skinWarmth | Warm skin tone blend from 0 to 2 |
| eyeContactIntensity | Eye openness/contact strength from 0 to 2 |
| modelFPS, modelPixelRatio | Render cadence and resolution multiplier. TalkingHead also multiplies modelPixelRatio by devicePixelRatio, so use values near 1 (not 4). Mobile is capped automatically. |
| dracoEnabled, dracoDecoderPath | Enable Draco-compressed avatar loading |
| modelDynamicBones | TalkingHead dynamic-bone config for rigged hair/body parts |
| update | Per-frame callback (dt, talkingHead) |
| accurateLipSync | true = REST per phrase, best lip-sync + pause/resume. Default true for stable tutor playback |
| speechRate | e.g. 0.9 for 10% slower (pitch-preserving) |
| speechGestures | Content-aware hand gestures (default true) |
| onReady, onError, onSpeechStart, onSpeechEnd, onSubtitle | Callbacks |
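Several props in the table are documented as ranging from 0 to 2. When those values come from user-facing sliders or saved settings, it may help to clamp them before passing them on. The sketch below is one way to do that; `clampStudioProps` is a hypothetical helper, and whether the component already clamps out-of-range values internally is not stated here.

```javascript
// Restrict a number to the inclusive range [min, max].
const clamp = (value, min, max) => Math.min(max, Math.max(min, value));

// Props the table above documents as ranging from 0 to 2.
const ZERO_TO_TWO_PROPS = ['skinGloss', 'skinPores', 'skinWarmth', 'eyeContactIntensity'];

// Hypothetical helper: returns a copy of `props` with the 0-to-2 props clamped.
// Other props (and non-numeric values) are passed through unchanged.
function clampStudioProps(props) {
  const result = { ...props };
  for (const name of ZERO_TO_TWO_PROPS) {
    if (typeof result[name] === 'number') {
      result[name] = clamp(result[name], 0, 2);
    }
  }
  return result;
}
```

The result can be spread onto the component, for example `<NarratorAvatar {...clampStudioProps(settings)} />`, where `settings` is your own object.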

Ref API

| Method / property | Description |
|-------------------|-------------|
| speakText(text, options?) | Speak text via TTS |
| pauseSpeaking() | Pause (phrase-level when accurateLipSync is true) |
| resumeSpeaking() | Resume from next phrase |
| stopSpeaking() | Stop and clear |
| makeEyeContact(durationMs?) | Ask the avatar to hold stronger eye contact |
| setMood(mood) | Change TalkingHead mood |
| setLighting(options) | Pass lighting options to TalkingHead |
| setView(view, options?) | Change camera view |
| playGesture(name, dur?, mirror?, ms?) | Play a built-in gesture |
| playAnimation(url, onprogress?, dur?, ndx?, scale?) | Play a Mixamo/RPM FBX animation |
| playPose(url, onprogress?, dur?, ndx?, scale?) | Play a Mixamo/RPM FBX pose |
| stopAnimation(), stopPose() | Stop active animation or pose |
| setMixerGain(speech, background?, fadeSecs?) | Adjust speech/background audio gain |
| playBackgroundAudio(url), stopBackgroundAudio() | Control background audio |
| isReady | Whether the avatar has finished loading |
| isSpeaking | Whether the avatar is currently speaking |
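Because the ref is empty until the component mounts and the avatar takes time to load, calls such as `speakText` are usually guarded. The following is a minimal sketch of such a guard. `speakIfReady` is a hypothetical helper, and it assumes `isReady` and `isSpeaking` are plain boolean properties on the ref object, as the table suggests.

```javascript
// Hypothetical guard around the ref API. `avatar` is whatever
// avatarRef.current holds; it is null before the component mounts.
function speakIfReady(avatar, text) {
  // Not mounted yet, or the model is still loading.
  if (!avatar || !avatar.isReady) return false;
  // Drop the current utterance first so the two do not overlap.
  if (avatar.isSpeaking) avatar.stopSpeaking();
  avatar.speakText(text);
  return true;
}
```

A button handler could then call `speakIfReady(avatarRef.current, 'Hello!')` and disable itself, or show a loading hint, when the function returns false.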

Environment variables

| Variable | Use |
|----------|-----|
| VITE_DEEPGRAM_API_KEY | Deepgram TTS (or pass ttsApiKey prop) |
| VITE_GOOGLE_TTS_API_KEY | Google TTS when ttsService="google" (or pass ttsApiKey) |
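If you prefer to pass `ttsApiKey` explicitly rather than rely on the environment lookup, the mapping in the table can be sketched as a small function. `resolveTtsApiKey` is a hypothetical helper, not part of the package, and it fails loudly when the key for the chosen service is missing.

```javascript
// Environment variable name for each supported ttsService value.
const TTS_ENV_KEYS = {
  deepgram: 'VITE_DEEPGRAM_API_KEY',
  google: 'VITE_GOOGLE_TTS_API_KEY',
};

// Hypothetical helper: look up the API key for a service in an env object
// (for example import.meta.env in a Vite app).
function resolveTtsApiKey(ttsService, env) {
  if (!Object.prototype.hasOwnProperty.call(TTS_ENV_KEYS, ttsService)) {
    throw new Error(`Unknown ttsService: ${ttsService}`);
  }
  const name = TTS_ENV_KEYS[ttsService];
  const key = env[name];
  if (!key) {
    throw new Error(`Missing ${name} for ttsService="${ttsService}"`);
  }
  return key;
}
```

In a Vite app this could be used as `ttsApiKey={resolveTtsApiKey('deepgram', import.meta.env)}`.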

TypeScript

The package is JavaScript. For TypeScript, add a declaration file (e.g. src/narrator-avatar.d.ts) that declares the component props and ref type, or use the component with // @ts-expect-error if you prefer.

Performance (web and mobile, one GLB per avatar)

Use one model URL per avatar (e.g. /avatars/tutor.glb). The component adapts at runtime, so separate -mobile / -web copies are not needed.

| Mechanism | Desktop | Mobile / slow network |
|-----------|---------|------------------------|
| visualQuality="auto" | ultra: 60 FPS, 4K shadows, full skin polish | balanced: 24 FPS, capped resolution, no shadows, lightweight skin (TalkingHead lights only) |
| lazyMount="auto" | init immediately | init when scrolled into view |
| Load timeout | 30s | 90s (large GLB download) |
| Tab hidden | animation paused | animation paused |

<NarratorAvatar
  avatarUrl="/avatars/test.glb"
  visualQuality="auto"
  lazyMount="auto"
/>

App tips: mount only one avatar on narrow viewports; enable gzip/brotli for static .glb on your CDN/hosting.

Keep modelPixelRatio near 1–1.25. TalkingHead multiplies it again by devicePixelRatio.
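To see why large values are costly, the multiplication described above can be written out. The sketch below computes the effective ratio and picks a `modelPixelRatio` that keeps it under a target. Both function names are hypothetical, and the default target of 2 is an assumption chosen for illustration, not a value taken from the package.

```javascript
// Effective render ratio, per the note above:
// modelPixelRatio multiplied by the browser's devicePixelRatio.
function effectivePixelRatio(modelPixelRatio, devicePixelRatio) {
  return modelPixelRatio * devicePixelRatio;
}

// Hypothetical helper: choose a modelPixelRatio no higher than 1.25 that
// keeps the effective ratio at or below maxEffectiveRatio (assumed default 2).
function pickModelPixelRatio(devicePixelRatio, maxEffectiveRatio = 2) {
  // Fall back to 1 if the browser reports a missing or invalid ratio.
  const dpr = devicePixelRatio > 0 ? devicePixelRatio : 1;
  return Math.min(1.25, maxEffectiveRatio / dpr);
}
```

For example, `modelPixelRatio={4}` on a display with a devicePixelRatio of 3 gives an effective ratio of 12, which is why values near 1 are recommended. In the browser, `pickModelPixelRatio(window.devicePixelRatio)` returns a value you could pass as the prop.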

Next.js: dedupe React in next.config (see narrator-avatar-test/next.config.ts).

License

MIT
