Foundry Voice Live React SDK
React hooks and components for Microsoft Foundry Voice Live API. Build real-time voice AI apps with Azure video avatars, Live2D avatars, 3D avatars, audio visualizers, function calling, and TypeScript support.
Install
```bash
npm install @iloveagents/foundry-voice-live-react
```

Quick Start
Region Availability: The default model (`gpt-realtime`) is only available in the East US 2 and Sweden Central regions. Make sure your Azure AI Foundry resource is deployed in one of those regions. See the Microsoft docs for current availability.
Voice Only
```tsx
import { useVoiceLive } from '@iloveagents/foundry-voice-live-react';

function App() {
  const { connect, disconnect, connectionState, audioStream } = useVoiceLive({
    connection: {
      resourceName: 'your-foundry-resource', // Azure AI Foundry resource name
      apiKey: 'your-foundry-api-key', // For dev only - see "Production" below
    },
    session: {
      instructions: 'You are a helpful assistant.',
    },
  });

  return (
    <>
      <p>Status: {connectionState}</p>
      <button onClick={connect} disabled={connectionState === 'connected'}>Start</button>
      <button onClick={disconnect} disabled={connectionState !== 'connected'}>Stop</button>
      <audio ref={el => { if (el && audioStream) el.srcObject = audioStream; }} autoPlay />
    </>
  );
}
```

With Avatar
```tsx
import { useVoiceLive, VoiceLiveAvatar } from '@iloveagents/foundry-voice-live-react';

function App() {
  const { videoStream, audioStream, connect, disconnect } = useVoiceLive({
    connection: {
      resourceName: 'your-foundry-resource',
      apiKey: 'your-foundry-api-key',
    },
    session: {
      instructions: 'You are a helpful assistant.',
      voice: { name: 'en-US-AvaMultilingualNeural', type: 'azure-standard' },
      avatar: { character: 'lisa', style: 'casual-sitting' },
    },
  });

  return (
    <>
      <VoiceLiveAvatar videoStream={videoStream} audioStream={audioStream} />
      <button onClick={connect}>Start</button>
      <button onClick={disconnect}>Stop</button>
    </>
  );
}
```

The microphone starts automatically when connected. No manual audio setup is needed.
Production
Never expose API keys in client-side code. Use a proxy server to secure your credentials.
1. Start the Proxy
```bash
# Docker (recommended)
docker run -p 8080:8080 \
  -e FOUNDRY_RESOURCE_NAME=your-foundry-resource \
  -e FOUNDRY_API_KEY="your-api-key" \
  -e ALLOWED_ORIGINS="*" \
  ghcr.io/iloveagents/foundry-voice-live-proxy:latest
```

Or with npx:

```bash
FOUNDRY_RESOURCE_NAME=your-foundry-resource \
FOUNDRY_API_KEY="your-api-key" \
ALLOWED_ORIGINS="*" \
npx @iloveagents/foundry-voice-live-proxy-node
```

2. Connect from Your App
```tsx
import { useVoiceLive } from '@iloveagents/foundry-voice-live-react';

function App() {
  const { connect, disconnect, connectionState, audioStream } = useVoiceLive({
    connection: {
      proxyUrl: 'ws://localhost:8080/ws', // Proxy handles auth
    },
    session: {
      instructions: 'You are a helpful assistant.',
    },
  });

  return (
    <>
      <p>Status: {connectionState}</p>
      <button onClick={connect}>Start</button>
      <button onClick={disconnect}>Stop</button>
      <audio ref={el => { if (el && audioStream) el.srcObject = audioStream; }} autoPlay />
    </>
  );
}
```

Authentication Options
- API Key via Proxy — Backend holds the key, client uses `proxyUrl`
- MSAL Token — Pass the token in the query string: `proxyUrl + '?token=' + msalToken`
See proxy package docs and proxy examples.
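A minimal sketch of the MSAL option. Only `proxyUrl` is part of this library's API; `msalToken` is a placeholder for a token you acquire through your own MSAL flow:

```tsx
import { useVoiceLive } from '@iloveagents/foundry-voice-live-react';

// Placeholder: acquire this with your MSAL flow (e.g. acquireTokenSilent)
// before rendering the component. The acquisition itself is not shown here.
declare const msalToken: string;

function App() {
  const { connect } = useVoiceLive({
    connection: {
      // The proxy validates the Entra ID token passed in the query string
      proxyUrl: 'ws://localhost:8080/ws?token=' + encodeURIComponent(msalToken),
    },
    session: {
      instructions: 'You are a helpful assistant.',
    },
  });

  return <button onClick={connect}>Start</button>;
}
```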
Configuration Helpers
Session Builder (Recommended)
Use the fluent `sessionConfig()` builder for clean, chainable configuration:
```tsx
import { useVoiceLive, sessionConfig } from '@iloveagents/foundry-voice-live-react';

const config = sessionConfig()
  .instructions('You are a helpful assistant.')
  .hdVoice('en-US-Ava:DragonHDLatestNeural', { temperature: 0.8 })
  .avatar('lisa', 'casual-sitting', { codec: 'h264' })
  .semanticVAD({ multilingual: true, interruptResponse: true })
  .echoCancellation()
  .noiseReduction()
  .build();

const { videoStream, audioStream } = useVoiceLive({
  connection: { resourceName: 'your-foundry-resource', apiKey: 'your-key' },
  session: config,
});
```

Builder Methods
| Method | Description |
| ------ | ----------- |
| .instructions(text) | Set system prompt |
| .voice(name) | Set voice by name |
| .hdVoice(name, { temperature?, rate? }) | Set HD voice with options |
| .customVoice(name) | Set custom voice |
| .avatar(character, style, options?) | Configure avatar |
| .transparentBackground() | Enable chroma key background |
| .backgroundImage(url) | Set avatar background image |
| .semanticVAD(options?) | Configure turn detection (use { multilingual: true } for 10-language support) |
| .endOfUtterance(options?) | Add end-of-utterance detection |
| .noTurnDetection() | Disable turn detection (manual mode) |
| .echoCancellation() | Enable server echo cancellation |
| .noiseReduction(type?) | Enable noise reduction ('deep' or 'nearField') |
| .transcription(options?) | Configure input transcription |
| .viseme() | Enable viseme output (lip-sync) |
| .wordTimestamps() | Enable word timestamps |
| .tools(tools) | Add function tools |
| .toolChoice(choice) | Set tool choice mode |
| .build() | Build the final config |
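As a second illustration, here is a sketch combining a few more methods from the table. Only documented builder calls are used; the particular pairing (viseme plus word timestamps with deep noise reduction) is just one plausible combination, not a prescribed recipe:

```tsx
import { sessionConfig } from '@iloveagents/foundry-voice-live-react';

// Sketch: standard voice with viseme and word-timestamp output enabled,
// e.g. for driving lip-sync. All method names come from the table above.
const lipSyncConfig = sessionConfig()
  .instructions('You are a helpful assistant.')
  .voice('en-US-AvaMultilingualNeural')
  .noiseReduction('deep')
  .viseme()
  .wordTimestamps()
  .build();
```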
Transcription with Phrase Lists
Improve speech recognition accuracy for specific terms:
```tsx
const config = sessionConfig()
  .transcription({
    model: 'azure-speech',
    language: 'en',
    phraseList: ['Neo QLED TV', 'TUF Gaming', 'AutoQuote Explorer'],
  })
  .build();
```

Note: `phraseList` and `customSpeech` require `model: 'azure-speech'` and don't work with gpt-realtime models.
Function Calling
Define tools the AI can call, then handle execution and send results back:
```tsx
import { useRef, useCallback } from 'react';
import { useVoiceLive } from '@iloveagents/foundry-voice-live-react';

function App() {
  // Ref lets toolExecutor call sendEvent before the hook result exists
  const sendEventRef = useRef<(event: any) => void>(() => {});

  const toolExecutor = useCallback((name: string, args: string, callId: string) => {
    const parsedArgs = JSON.parse(args);
    let result = {};
    if (name === 'get_weather') {
      result = { temperature: '72°F', location: parsedArgs.location };
    }
    // Send the result back to the API, then request a new response
    sendEventRef.current({
      type: 'conversation.item.create',
      item: { type: 'function_call_output', call_id: callId, output: JSON.stringify(result) },
    });
    sendEventRef.current({ type: 'response.create' });
  }, []);

  const { connect, sendEvent } = useVoiceLive({
    connection: { resourceName: 'your-foundry-resource', apiKey: 'your-key' },
    session: {
      instructions: 'You can check the weather.',
      tools: [{
        type: 'function',
        name: 'get_weather',
        description: 'Get weather for a location',
        parameters: { type: 'object', properties: { location: { type: 'string' } }, required: ['location'] },
      }],
      toolChoice: 'auto',
    },
    toolExecutor,
  });
  sendEventRef.current = sendEvent;

  return <button onClick={connect}>Start</button>;
}
```

Event Handling
```tsx
const { connect } = useVoiceLive({
  connection: { resourceName: 'your-foundry-resource', apiKey: 'your-key' },
  onEvent: (event) => {
    switch (event.type) {
      case 'session.created':
        console.log('Connected');
        break;
      case 'conversation.item.input_audio_transcription.completed':
        console.log('User:', event.transcript);
        break;
      case 'response.audio_transcript.delta':
        console.log('AI:', event.delta);
        break;
      case 'error':
        console.error(event.error);
        break;
    }
  },
});
```

API
useVoiceLive(config)
Returns:
```ts
{
  connectionState: 'disconnected' | 'connecting' | 'connected';
  videoStream: MediaStream | null;    // Avatar video
  audioStream: MediaStream | null;    // Audio playback
  audioAnalyser: AnalyserNode | null; // For visualization
  isMicActive: boolean;
  connect: () => Promise<void>;
  disconnect: () => void;
  sendEvent: (event: any) => void;
  updateSession: (config) => void;
  error: string | null;
}
```
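Since `audioAnalyser` is a standard Web Audio `AnalyserNode`, any visualization that reads frequency or time-domain data can consume it. A minimal level-meter sketch (the polling loop and component below are illustrative; only the `AnalyserNode` itself comes from the hook):

```tsx
import { useEffect, useRef } from 'react';

// Illustrative level meter: polls the AnalyserNode returned by useVoiceLive
// and writes the average frequency magnitude into a <progress> element.
function LevelMeter({ analyser }: { analyser: AnalyserNode | null }) {
  const barRef = useRef<HTMLProgressElement>(null);

  useEffect(() => {
    if (!analyser) return;
    const data = new Uint8Array(analyser.frequencyBinCount);
    let frame = 0;
    const tick = () => {
      analyser.getByteFrequencyData(data);
      const avg = data.reduce((sum, v) => sum + v, 0) / data.length;
      if (barRef.current) barRef.current.value = avg / 255; // normalize to 0..1
      frame = requestAnimationFrame(tick);
    };
    frame = requestAnimationFrame(tick);
    return () => cancelAnimationFrame(frame);
  }, [analyser]);

  return <progress ref={barRef} max={1} />;
}
```

Pass the hook's value straight in: `<LevelMeter analyser={audioAnalyser} />`.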
VoiceLiveAvatar

```tsx
<VoiceLiveAvatar
  videoStream={videoStream}   // Required: video from useVoiceLive
  audioStream={audioStream}   // Required: audio from useVoiceLive
  enableChromaKey={true}      // Remove green background
  chromaKeyColor="#00FF00"    // Key color
  chromaKeySimilarity={0.4}   // Color match threshold
  chromaKeySmoothness={0.1}   // Edge smoothness
  loadingMessage="Loading..." // Shown before video starts
/>
```

Examples
Working examples for all features:
| Example | Description |
| ---------------- | -------------------- |
| Voice Basic | Minimal voice chat |
| Voice Advanced | VAD, noise reduction |
| Voice Proxy | Secure proxy pattern |
| Voice MSAL | Entra ID auth |
| Avatar Basic | Avatar video |
| Avatar Advanced | Chroma key, 1080p |
| Function Calling | Tool integration |
| Audio Visualizer | Waveform display |
| Viseme | Lip-sync data |
| Live2D Avatar | Live2D integration |
| 3D Avatar | React Three Fiber |
| Agent Service | Foundry Agent |
Run examples locally:
```bash
git clone https://github.com/iLoveAgents/foundry-voice-live
cd foundry-voice-live
just install
just dev # Opens at http://localhost:3001
```

Related
- Proxy Package - Secure WebSocket proxy for production
- Voice Live API Docs - Microsoft documentation
- Examples - Full working examples
- iLoveAgents Blog - Guides for Microsoft Foundry & Agent Framework
Support
If this library made your life easier, a coffee is a simple way to say thanks ☕. It directly supports maintenance and future features.
License
MIT - iLoveAgents
