@navai/voice-mobile
v0.1.6
Mobile helpers to integrate OpenAI Realtime voice with Navai backend routes
Mobile package to run Navai voice agents in React Native applications.
It provides a complete mobile stack for:
- Backend client_secret retrieval.
- WebRTC transport negotiation for Realtime.
- Route-aware and function-aware mobile tools.
- Local dynamic function loading.
- Realtime tool call parsing and result event emission.
- React hook lifecycle for microphone, transport, and session.
- Optional local speech playback for hybrid ElevenLabs output.
Installation
npm install @navai/voice-mobile
npm install react react-native react-native-webrtc

react-native-webrtc is a peer dependency and must exist in the consuming app.
Architecture Overview
This package is organized in layers:
- Runtime/config layer
  - src/runtime.ts: resolves env values, API URL, routes file, function folder filters, agent folders, and model override.
- Function layer
  - src/functions.ts: loads local modules and converts exports into callable function definitions.
- Agent runtime layer
  - src/agent.ts
    - builds mobile instructions and tool schemas.
    - executes navigate_to and execute_app_function.
    - parses tool calls from Realtime events.
    - creates response events for function_call outputs.
- Backend bridge layer
  - src/backend.ts: API client for Navai backend routes.
- Session orchestration layer
  - src/session.ts
    - coordinates backend client + transport.
    - handles start/stop, function preloading, event forwarding.
- Transport layer
  - src/transport.ts: interface contract for custom transports.
  - src/react-native-webrtc.ts: implementation for React Native WebRTC.
- React integration layer
  - src/useMobileVoiceAgent.ts: hook that combines runtime, local functions, backend tools, permissions, and session state.
End-to-End Flow
Typical hook-driven flow:
- App resolves runtime config with generated module loaders.
- Hook dynamically loads react-native-webrtc.
- Hook loads local function registry from module loaders.
- On start():
  - validates runtime state.
  - requests Android microphone permission when needed.
  - creates backend client and WebRTC transport.
  - starts mobile voice session (client secret + transport connect).
  - builds mobile agent runtime (instructions + tool schemas).
  - sends session.update event with tools and instructions.
  - when speech.provider=elevenlabs, the session is configured to receive text output and the hook can synthesize/play audio locally.
- During conversation:
  - incoming Realtime events are parsed for tool calls.
  - tool call outputs are emitted back via conversation.item.create and response.create.
- On stop():
  - transport disconnects.
  - local refs and pending tool maps are cleared.
Public API
Main exports:
- resolveNavaiMobileEnv(...)
- resolveNavaiMobileRuntimeConfig(...)
- resolveNavaiMobileApplicationRuntimeConfig(...)
- loadNavaiFunctions(...)
- createNavaiMobileAgentRuntime(...)
- extractNavaiRealtimeToolCalls(...)
- buildNavaiRealtimeToolResultEvents(...)
- createNavaiMobileBackendClient(...)
- createNavaiMobileVoiceSession(...)
- createReactNativeWebRtcTransport(...)
- useMobileVoiceAgent(...)
Important types:
- NavaiRoute
- NavaiFunctionDefinition
- NavaiRealtimeTransport
- NavaiMobileVoiceSession
- NavaiBackendSpeechConfig
- NavaiMobileSpeechPlayer
- ResolveNavaiMobileApplicationRuntimeConfigResult
- UseMobileVoiceAgentTransportOptions
Tool Runtime Design
Mobile tool surface is intentionally stable:
- navigate_to
- execute_app_function
Execution behavior:
navigate_to:
- validates target.
- resolves route using route matcher.
- calls navigate(path).

execute_app_function:
- validates function_name.
- tries local function first.
- falls back to backend function if local not found.

Graceful compatibility fallback:
- if the model calls a function name directly as a tool, the runtime routes it as execute_app_function.
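
The dispatch rules above can be sketched roughly as follows. This is a simplified, synchronous illustration and not the package's implementation: `dispatchToolCall`, `ToolDeps`, the `payload` argument key, and the result shapes are all hypothetical names chosen for this example, and the real backend call is asynchronous.

```typescript
// Hypothetical shapes for illustration; the package's real types may differ.
type AppFunction = (payload: unknown) => unknown;

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ToolDeps {
  resolveRoute: (target: string) => string | undefined; // stands in for the route matcher
  navigate: (path: string) => void;
  localFunctions: Map<string, AppFunction>;
  executeBackendFunction: (name: string, payload: unknown) => unknown;
}

function dispatchToolCall(call: ToolCall, deps: ToolDeps): unknown {
  if (call.name === "navigate_to") {
    const target = call.args["target"];
    if (typeof target !== "string" || target.trim() === "") {
      return { ok: false, error: "target is required" };
    }
    const path = deps.resolveRoute(target);
    if (path === undefined) {
      return { ok: false, error: `unknown route: ${target}` };
    }
    deps.navigate(path);
    return { ok: true, path };
  }

  // Compatibility fallback: a direct call such as `open_cart` is handled as
  // execute_app_function with function_name = "open_cart".
  const isDirect = call.name !== "execute_app_function";
  const functionName = isDirect ? call.name : call.args["function_name"];
  const payload = isDirect ? call.args : call.args["payload"]; // `payload` key is assumed
  if (typeof functionName !== "string" || functionName === "") {
    return { ok: false, error: "function_name is required" };
  }

  // Local functions win; the backend is only consulted when no local match exists.
  const local = deps.localFunctions.get(functionName);
  if (local) {
    return { ok: true, source: "local", result: local(payload) };
  }
  return {
    ok: true,
    source: "backend",
    result: deps.executeBackendFunction(functionName, payload),
  };
}
```

Keeping the tool surface to two names means the session's tool schema does not have to change when app functions are added or removed; only the function registry does.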
Realtime Tool Event Handling
extractNavaiRealtimeToolCalls understands multiple event families:
- response.function_call_arguments.done
- response.output_item.done
- response.output_item.added
- conversation.item.created
- conversation.item.added
- conversation.item.done
- conversation.item.retrieved
- response.done
Partial tool calls are ignored until completed status is available.
buildNavaiRealtimeToolResultEvents emits two events:
- conversation.item.create with function_call_output.
- response.create to resume model generation.
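
As a rough illustration of the two steps above, the sketch below extracts a completed function call from a single event family (response.output_item.done) and builds the two result events. The helper names `extractCompletedToolCall` and `buildToolResultEvents` are hypothetical; the package's own functions cover more event families and may shape their results differently.

```typescript
interface RealtimeEvent {
  type: string;
  [key: string]: unknown;
}

interface ToolCallRef {
  callId: string;
  name: string;
  argumentsJson: string;
}

// Handles only `response.output_item.done`; other event families are omitted here.
function extractCompletedToolCall(event: RealtimeEvent): ToolCallRef | undefined {
  if (event.type !== "response.output_item.done") return undefined;
  const item = event["item"] as
    | { type?: unknown; status?: unknown; call_id?: unknown; name?: unknown; arguments?: unknown }
    | undefined;
  // Partial calls (no "completed" status yet) are ignored.
  if (!item || item.type !== "function_call" || item.status !== "completed") return undefined;
  if (typeof item.call_id !== "string" || typeof item.name !== "string") return undefined;
  return {
    callId: item.call_id,
    name: item.name,
    argumentsJson: typeof item.arguments === "string" ? item.arguments : "{}",
  };
}

function buildToolResultEvents(callId: string, output: unknown): RealtimeEvent[] {
  return [
    {
      type: "conversation.item.create",
      item: {
        type: "function_call_output",
        call_id: callId,
        // The output travels as a string, so non-string results are serialized.
        output: typeof output === "string" ? output : JSON.stringify(output ?? null),
      },
    },
    // Asks the model to continue now that the tool result is in the conversation.
    { type: "response.create" },
  ];
}
```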
Runtime Config and Env Resolution
resolveNavaiMobileRuntimeConfig priority:
- Explicit options.
- Env object values.
- Defaults.
Keys:
- NAVAI_FUNCTIONS_FOLDERS
- NAVAI_AGENTS_FOLDERS
- NAVAI_ROUTES_FILE
- NAVAI_REALTIME_MODEL
Defaults:
- routes file: src/ai/routes.ts
- functions folder: src/ai/functions-modules
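
The priority order (explicit options, then env values, then defaults) could look like the sketch below. `resolveConfig`, `RuntimeOptions`, and `firstNonEmpty` are hypothetical names for this example, and treating blank strings as unset is an assumption rather than documented behavior.

```typescript
// Hypothetical option names; only the env keys and default paths come from this README.
interface RuntimeOptions {
  routesFile?: string;
  functionsFolders?: string;
  realtimeModel?: string;
}

type EnvLike = Record<string, string | undefined>;

const DEFAULT_ROUTES_FILE = "src/ai/routes.ts";
const DEFAULT_FUNCTIONS_FOLDER = "src/ai/functions-modules";

// Returns the first candidate that is defined and not blank (assumption: blank means unset).
function firstNonEmpty(...candidates: Array<string | undefined>): string | undefined {
  return candidates
    .map((candidate) => candidate?.trim())
    .find((candidate) => candidate !== undefined && candidate !== "");
}

function resolveConfig(options: RuntimeOptions, env: EnvLike) {
  return {
    routesFile:
      firstNonEmpty(options.routesFile, env["NAVAI_ROUTES_FILE"]) ?? DEFAULT_ROUTES_FILE,
    functionsFolders:
      firstNonEmpty(options.functionsFolders, env["NAVAI_FUNCTIONS_FOLDERS"]) ??
      DEFAULT_FUNCTIONS_FOLDER,
    // No default here: an unset model leaves the choice to the transport.
    realtimeModel: firstNonEmpty(options.realtimeModel, env["NAVAI_REALTIME_MODEL"]),
  };
}
```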
Multi-agent layout:
- NAVAI_FUNCTIONS_FOLDERS=src/ai
- NAVAI_AGENTS_FOLDERS=main,support,sales
- local modules live in src/ai/<agent>/...
- optional per-agent config file: src/ai/<agent>/agent.config.ts
Matcher formats:
- folder
- recursive folder (/...)
- wildcard (*)
- explicit file
- CSV list
Fallback behavior:
- configured folders with no matches emit a warning.
- the resolver then falls back to the default folder.
Current limitation:
- mobile resolves agent folders and primary agent metadata, but the mobile runtime still exposes one active tool surface (navigate_to, execute_app_function).
- official realtime handoffs are implemented in web first; mobile is prepared for the same structure but not yet switched to SDK-native handoffs.
resolveNavaiMobileApplicationRuntimeConfig also resolves:
- apiBaseUrl from:
  - explicit apiBaseUrl
  - env.NAVAI_API_URL
  - explicit defaultApiBaseUrl
  - default http://localhost:3000
- warning when generated module loader map is empty.
resolveNavaiMobileEnv lets you merge multiple env-like sources (for example Expo extra, process.env, custom config object).
Backend Client Contract
createNavaiMobileBackendClient calls:
- POST /navai/realtime/client-secret
- POST /navai/speech/synthesize
- GET /navai/functions
- POST /navai/functions/execute
Base URL priority:
- apiBaseUrl option.
- env.NAVAI_API_URL.
- fallback http://localhost:3000.
listFunctions returns warnings instead of throwing on most parse/network failures.
createClientSecret and executeFunction throw on request failures or invalid responses.
createClientSecret() returns { value, expires_at, speech }, where speech.provider is openai or elevenlabs.
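
The difference between the throwing and the warning-returning calls can be sketched as below. This is not the package's client: the function names mirror the README for readability, but the `HttpFetch` type, the empty request body, and the `items` response field are assumptions made for this example.

```typescript
// Minimal structural types so the sketch does not depend on a global fetch.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type HttpFetch = (
  url: string,
  init: { method: string; headers?: Record<string, string>; body?: string }
) => Promise<HttpResponse>;

interface ClientSecret {
  value: string;
  expires_at?: number;
  speech?: { provider: "openai" | "elevenlabs" };
}

// Throws on request failures or invalid responses.
async function createClientSecret(fetchImpl: HttpFetch, apiBaseUrl: string): Promise<ClientSecret> {
  const response = await fetchImpl(`${apiBaseUrl}/navai/realtime/client-secret`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: "{}", // assumed: the real request body is not documented here
  });
  if (!response.ok) {
    throw new Error(`client-secret request failed with status ${response.status}`);
  }
  const data = (await response.json()) as Partial<ClientSecret> | null;
  if (!data || typeof data.value !== "string") {
    throw new Error("invalid client-secret response");
  }
  return data as ClientSecret;
}

// Returns warnings instead of throwing, so a failed listing does not block the session.
async function listFunctions(
  fetchImpl: HttpFetch,
  apiBaseUrl: string
): Promise<{ items: unknown[]; warnings: string[] }> {
  try {
    const response = await fetchImpl(`${apiBaseUrl}/navai/functions`, { method: "GET" });
    if (!response.ok) {
      return { items: [], warnings: [`functions request failed with status ${response.status}`] };
    }
    const data = (await response.json()) as { items?: unknown } | null; // `items` is assumed
    if (!data || !Array.isArray(data.items)) {
      return { items: [], warnings: ["unexpected functions response shape"] };
    }
    return { items: data.items, warnings: [] };
  } catch (error) {
    return { items: [], warnings: [`functions request failed: ${String(error)}`] };
  }
}
```

Injecting the fetch function keeps the sketch testable; in an app the platform's global fetch would likely be passed in.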
Session Orchestrator Details
createNavaiMobileVoiceSession responsibilities:
- Function list cache.
- Session state transitions (idle, connecting, connected, error).
- Start flow:
  - optional backend function preload.
  - client secret request.
  - transport connect with clientSecret and optional model.
- Stop flow:
  - transport disconnect.
- Realtime event send helper (requires transport sendEvent implementation).
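
A minimal sketch of these state transitions is shown below, assuming a transport with connect, disconnect, and an optional sendEvent. `createSession` and its dependency names are hypothetical, function preloading and event forwarding are left out, and resetting to idle on stop is an assumption.

```typescript
type SessionState = "idle" | "connecting" | "connected" | "error";

// Assumed transport contract for this sketch; the package's interface may differ.
interface Transport {
  connect(options: { clientSecret: string; model?: string }): Promise<void>;
  disconnect(): Promise<void>;
  sendEvent?(event: unknown): void;
}

interface SessionDeps {
  createClientSecret: () => Promise<string>;
  transport: Transport;
  model?: string;
}

function createSession(deps: SessionDeps) {
  let state: SessionState = "idle";

  return {
    getState: (): SessionState => state,

    async start(): Promise<void> {
      if (state === "connecting" || state === "connected") return; // already running
      state = "connecting";
      try {
        const clientSecret = await deps.createClientSecret();
        await deps.transport.connect({ clientSecret, model: deps.model });
        state = "connected";
      } catch (error) {
        state = "error";
        throw error;
      }
    },

    async stop(): Promise<void> {
      await deps.transport.disconnect();
      state = "idle";
    },

    // Sending events only works when the transport implements sendEvent.
    sendRealtimeEvent(event: unknown): void {
      if (!deps.transport.sendEvent) {
        throw new Error("transport does not implement sendEvent");
      }
      deps.transport.sendEvent(event);
    },
  };
}
```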
React Native WebRTC Transport Details
createReactNativeWebRtcTransport default behavior:
- realtime endpoint: https://api.openai.com/v1/realtime/calls
- model default: gpt-realtime
- creates RTCPeerConnection
- opens data channel oai-events
- captures microphone via mediaDevices.getUserMedia
- negotiates SDP with OpenAI
- waits for data channel open before resolving connect
Resilience behavior:
- tracks transport state (idle, connecting, connected, error, closed)
- propagates connection/data channel errors via callbacks
- cleans tracks, channel, and connection on disconnect
- supports configurable remote audio track volume via private _setVolume when available
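
The SDP negotiation step can be sketched as an offer/answer exchange over HTTP. The code below is an assumption-laden outline rather than the transport's actual code: `negotiateSdp`, `PeerConnectionLike`, and `SdpFetch` are hypothetical stand-ins for the WebRTC peer connection and fetch, and model selection, the data channel, and microphone capture are omitted.

```typescript
// Structural stand-ins so the sketch runs without a WebRTC runtime.
interface SessionDescription {
  type: string;
  sdp?: string;
}

interface PeerConnectionLike {
  createOffer(): Promise<SessionDescription>;
  setLocalDescription(description: SessionDescription): Promise<void>;
  setRemoteDescription(description: { type: "answer"; sdp: string }): Promise<void>;
}

interface SdpResponse {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}

type SdpFetch = (
  url: string,
  init: { method: "POST"; headers: Record<string, string>; body: string }
) => Promise<SdpResponse>;

const REALTIME_CALLS_URL = "https://api.openai.com/v1/realtime/calls";

async function negotiateSdp(
  pc: PeerConnectionLike,
  fetchImpl: SdpFetch,
  clientSecret: string,
  endpoint: string = REALTIME_CALLS_URL
): Promise<string> {
  // 1. Create the local offer and apply it.
  const offer = await pc.createOffer();
  if (!offer.sdp) throw new Error("offer has no SDP");
  await pc.setLocalDescription(offer);

  // 2. Send the offer SDP, authenticated with the short-lived client secret.
  const response = await fetchImpl(endpoint, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${clientSecret}`,
      "Content-Type": "application/sdp",
    },
    body: offer.sdp,
  });
  if (!response.ok) {
    throw new Error(`SDP exchange failed with status ${response.status}`);
  }

  // 3. Apply the answer SDP returned in the response body.
  const answerSdp = await response.text();
  await pc.setRemoteDescription({ type: "answer", sdp: answerSdp });
  return answerSdp;
}
```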
React Hook Internals
useMobileVoiceAgent adds app-level behavior:
- Android microphone permission request.
- dynamic require("react-native-webrtc").
- pending tool call queue while runtime/session is initializing.
- dedup of handled tool call ids.
- automatic session.update after session starts.
- optional transportOptions passthrough for rtcConfiguration, audioConstraints, and remoteAudioTrackVolume.
- optional speechPlayer for local playback when backend uses ElevenLabs hybrid TTS.
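
The pending queue and the id dedup can be combined in one small gate, sketched below. `createToolCallGate` and its method names are hypothetical and only illustrate the idea; the hook keeps this state in React refs and its internals may differ.

```typescript
interface PendingToolCall {
  callId: string;
  name: string;
}

function createToolCallGate(handle: (call: PendingToolCall) => void) {
  const handledIds = new Set<string>();
  const pending: PendingToolCall[] = [];
  let ready = false;

  function run(call: PendingToolCall): void {
    // The same call can arrive through several event families; handle it once.
    if (handledIds.has(call.callId)) return;
    handledIds.add(call.callId);
    handle(call);
  }

  return {
    // Calls that arrive before the runtime/session is ready are queued.
    submit(call: PendingToolCall): void {
      if (!ready) {
        pending.push(call);
        return;
      }
      run(call);
    },
    // Flushes the queue in arrival order once initialization has finished.
    markReady(): void {
      ready = true;
      for (const call of pending.splice(0)) run(call);
    },
    // Mirrors stop(): forget queued calls and handled ids.
    reset(): void {
      ready = false;
      pending.length = 0;
      handledIds.clear();
    },
  };
}
```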
Hook states:
- idle
- connecting
- connected
- error
Agent voice state exposed by the hook:
- agentVoiceState: idle | speaking
- isAgentSpeaking: boolean
agentVoiceState is inferred from realtime audio events (response.output_audio.delta, response.output_audio.done, output_audio_buffer.started, output_audio_buffer.stopped, response.done).
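
One plausible way to infer the state from those events is a small reducer, sketched below. The mapping of each event to speaking or idle is an assumption for this example; the README names the events but not how the hook interprets each one.

```typescript
type AgentVoiceState = "idle" | "speaking";

// Assumed mapping: audio starting or streaming means speaking, completion means idle.
const SPEAKING_EVENTS = new Set<string>([
  "response.output_audio.delta",
  "output_audio_buffer.started",
]);

const IDLE_EVENTS = new Set<string>([
  "response.output_audio.done",
  "output_audio_buffer.stopped",
  "response.done",
]);

function reduceAgentVoiceState(current: AgentVoiceState, eventType: string): AgentVoiceState {
  if (SPEAKING_EVENTS.has(eventType)) return "speaking";
  if (IDLE_EVENTS.has(eventType)) return "idle";
  return current; // unrelated events leave the state unchanged
}

function isAgentSpeaking(state: AgentVoiceState): boolean {
  return state === "speaking";
}
```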
Hybrid Speech Mode
When backend returns speech.provider: "elevenlabs":
- useMobileVoiceAgent updates the Realtime session to request output_modalities: ["text"].
- assistant final text is sent to backendClient.synthesizeSpeech(...).
- the synthesized audio is played through the app-provided speechPlayer.
- if no speechPlayer is provided, the hook logs a warning and skips local playback.
Generated Loader CLI
This package ships:
navai-generate-mobile-loaders
Default behavior:
- Read NAVAI_FUNCTIONS_FOLDERS and NAVAI_ROUTES_FILE from process env or .env.
- Read NAVAI_AGENTS_FOLDERS when present.
- Scan src/ for source files.
- Select only modules matching configured function folders.
- If agents are configured, keep only files inside src/ai/<agent>/....
- Include route module.
- Include files referenced by route module string literals like src/... (for screen modules).
- Write src/ai/generated-module-loaders.ts.
Useful flags:
- --project-root <path>
- --src-root <path>
- --output-file <path>
- --env-file <path>
- --default-functions-folder <path>
- --default-routes-file <path>
- --type-import <module>
- --export-name <identifier>
Auto Setup on npm Install
Postinstall can auto-add missing scripts:
- generate:ai-modules -> navai-generate-mobile-loaders
- predev -> npm run generate:ai-modules
- preandroid -> npm run generate:ai-modules
- preios -> npm run generate:ai-modules
- pretypecheck -> npm run generate:ai-modules
Rules:
- only missing scripts are added.
- existing scripts are never overwritten.
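
The two rules amount to a non-destructive merge into the scripts section of package.json. The sketch below shows that merge on a plain object; `addMissingScripts` is a hypothetical helper, not the postinstall script itself, and file reading and writing are left out.

```typescript
// Script names and commands as listed above.
const NAVAI_SCRIPTS: Record<string, string> = {
  "generate:ai-modules": "navai-generate-mobile-loaders",
  predev: "npm run generate:ai-modules",
  preandroid: "npm run generate:ai-modules",
  preios: "npm run generate:ai-modules",
  pretypecheck: "npm run generate:ai-modules",
};

function addMissingScripts(existing: Record<string, string>): {
  scripts: Record<string, string>;
  added: string[];
} {
  const scripts: Record<string, string> = { ...existing }; // never mutate the input
  const added: string[] = [];
  for (const [name, command] of Object.entries(NAVAI_SCRIPTS)) {
    // Only fill gaps; a script the app already defines is left untouched.
    if (!Object.prototype.hasOwnProperty.call(scripts, name)) {
      scripts[name] = command;
      added.push(name);
    }
  }
  return { scripts, added };
}
```

An app that already defines its own predev script would therefore keep it and receive only the other four entries.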
Disable auto setup:
- NAVAI_SKIP_AUTO_SETUP=1
- or NAVAI_SKIP_MOBILE_AUTO_SETUP=1
Manual setup runner:
npx navai-setup-voice-mobile

Integration Examples
Low-level integration:
import { mediaDevices, RTCPeerConnection } from "react-native-webrtc";
import {
  createNavaiMobileBackendClient,
  createNavaiMobileVoiceSession,
  createReactNativeWebRtcTransport
} from "@navai/voice-mobile";

// API client for the Navai backend routes.
const backend = createNavaiMobileBackendClient({
  apiBaseUrl: "http://localhost:3000"
});

// WebRTC transport built on the react-native-webrtc globals.
const transport = createReactNativeWebRtcTransport({
  globals: { mediaDevices, RTCPeerConnection }
});

// The session coordinates the backend client and the transport.
const session = createNavaiMobileVoiceSession({
  backendClient: backend,
  transport,
  onRealtimeEvent: (event) => console.log(event),
  onRealtimeError: (error) => console.error(error)
});

// Call from an async context (for example, a button handler).
await session.start();

Hook integration:
import { useMobileVoiceAgent } from "@navai/voice-mobile";

const voice = useMobileVoiceAgent({
  runtime,
  runtimeLoading,
  runtimeError,
  navigate: (path) => navigation.navigate(path as never)
});

React Native CLI Android (opt-in transport config):
Leave transportOptions undefined in Expo if the current defaults already work for your app. In bare React Native CLI on Android, you can opt in to an explicit WebRTC transport configuration without changing Expo behavior.
import { Platform } from "react-native";
import {
  useMobileVoiceAgent,
  type UseMobileVoiceAgentTransportOptions
} from "@navai/voice-mobile";

const androidBareTransportOptions: UseMobileVoiceAgentTransportOptions | undefined =
  Platform.OS === "android"
    ? {
        rtcConfiguration: {
          iceServers: [{ urls: ["stun:stun.l.google.com:19302"] }]
        },
        audioConstraints: {
          audio: {
            echoCancellation: true,
            noiseSuppression: true,
            autoGainControl: true
          },
          video: false
        },
        remoteAudioTrackVolume: 10
      }
    : undefined;

const voice = useMobileVoiceAgent({
  runtime,
  runtimeLoading,
  runtimeError,
  navigate: (path) => navigation.navigate(path as never),
  transportOptions: androidBareTransportOptions
});

Expected Backend Routes
- POST /navai/realtime/client-secret
- POST /navai/speech/synthesize
- GET /navai/functions
- POST /navai/functions/execute
These can be provided by registerNavaiExpressRoutes from @navai/voice-backend.
Related Docs
- Spanish version: README.es.md
- English version: README.en.md
- Backend package: ../voice-backend/README.md
- Frontend package: ../voice-frontend/README.md
- Playground Mobile: ../../apps/playground-mobile/README.md
- Playground API: ../../apps/playground-api/README.md
