@navai/voice-backend
v0.1.3
Backend helpers to mint OpenAI Realtime client secrets
Backend package for Navai voice applications.
This package solves three backend responsibilities:
- Mint secure ephemeral client_secret tokens for OpenAI Realtime.
- Proxy optional ElevenLabs speech synthesis for hybrid voice output.
- Discover, validate, expose, and execute backend tools from your codebase.
Installation
npm install @navai/voice-backend
express is a peer dependency.
Architecture Overview
Runtime architecture has three layers:
- src/index.ts: Entry layer. Exposes public API, client secret helpers, and Express route registration.
- src/runtime.ts: Discovery layer. Resolves NAVAI_FUNCTIONS_FOLDERS + NAVAI_AGENTS_FOLDERS, scans files, applies path matching rules, and builds module loaders.
- src/functions.ts: Execution layer. Imports matched modules, transforms exports into normalized tool definitions, and executes them safely.
End-to-end request flow:
- Frontend/mobile calls POST /navai/realtime/client-secret.
- Backend validates options and API key policy.
- Backend calls OpenAI POST https://api.openai.com/v1/realtime/client_secrets.
- When speech.provider=elevenlabs, frontend/mobile later calls POST /navai/speech/synthesize with final assistant text.
- Frontend/mobile calls GET /navai/functions to discover allowed tools.
- Agent calls POST /navai/functions/execute with function_name and payload.
- Backend executes only tool names loaded in the registry.
Note about agent voice state:
- The backend does not directly detect when assistant speech starts/stops, because realtime audio runs on the client side.
- That state is exposed by @navai/voice-frontend, @navai/voice-mobile, and the WordPress widget through frontend events.
Public API
Client secret helpers:
- getNavaiVoiceBackendOptionsFromEnv(env?)
- createRealtimeClientSecret(options, request?)
- createExpressClientSecretHandler(options)
- synthesizeSpeech(options, request?)
Express integration:
registerNavaiExpressRoutes(app, options?)
Dynamic runtime helpers:
- resolveNavaiBackendRuntimeConfig(options?)
- loadNavaiFunctions(functionModuleLoaders)
Key exported types:
- NavaiVoiceBackendOptions
- CreateClientSecretRequest
- OpenAIRealtimeClientSecretResponse
- SynthesizeSpeechRequest
- SynthesizeSpeechResponse
- NavaiSpeechProvider
- ResolveNavaiBackendRuntimeConfigOptions
- NavaiFunctionDefinition
- NavaiFunctionModuleLoaders
- NavaiFunctionsRegistry
Detailed Route Behavior
registerNavaiExpressRoutes registers these routes by default:
- POST /navai/realtime/client-secret
- POST /navai/speech/synthesize
- GET /navai/functions
- POST /navai/functions/execute
Custom route paths are supported with:
- clientSecretPath
- speechSynthesizePath
- functionsListPath
- functionsExecutePath
includeFunctionsRoutes controls whether /navai/functions* routes are mounted.
Important runtime detail:
- tool runtime is lazy-loaded once and cached in-memory.
- first call to list/execute functions builds the registry.
- after initial load, file changes are not auto-reloaded unless process restarts.
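The load-once behavior described above can be pictured with a small memoizing wrapper. This is only a sketch: lazyOnce and the stand-in registry builder are hypothetical names, not part of the package API, and the real runtime may cache differently.

```typescript
// Hypothetical helper: wraps a builder so it runs at most once per process.
function lazyOnce<T>(build: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (!cached) {
      // First call pays the build cost; the result stays in memory afterwards.
      cached = { value: build() };
    }
    return cached.value;
  };
}

// Stand-in for the real registry build (which would scan and import modules).
let buildCount = 0;
const getRegistry = lazyOnce(() => {
  buildCount += 1;
  return { tools: ["secret_password"] };
});

const firstRegistry = getRegistry(); // builds the registry
const secondRegistry = getRegistry(); // served from memory, no rebuild
console.log(firstRegistry === secondRegistry, buildCount);
```

If the builder returns a Promise, the same wrapper caches the Promise, so concurrent first requests would share one load. Nothing here watches the file system, which matches the note that a process restart is needed to pick up changed files.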
Client Secret Pipeline
createRealtimeClientSecret behavior:
- Validates options.
  - clientSecretTtlSeconds must be between 10 and 7200.
  - if backend key is missing and request keys are not allowed, it throws.
- Resolves API key with strict priority.
  - backend openaiApiKey always wins.
  - request apiKey is fallback only if backend key is missing.
- Builds session payload.
  - model default: gpt-realtime
  - voice default: marin
  - instructions include base instructions plus optional language/accent/tone lines.
  - when NAVAI_TTS_PROVIDER=elevenlabs, backend requests text-only Realtime output with output_modalities: ["text"].
- Calls OpenAI Realtime client secret endpoint and returns:
  - value
  - expires_at
  - speech.provider
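The validation and key-priority steps above can be sketched as two small functions. This is an illustration under assumptions, not the package source: KeyPolicyOptions, validateTtl, and resolveApiKey are hypothetical names, and which error is raised in which situation is a guess based on the messages listed under Common errors.

```typescript
// Hypothetical subset of the backend options discussed above.
interface KeyPolicyOptions {
  openaiApiKey?: string;
  allowApiKeyFromRequest?: boolean;
  clientSecretTtlSeconds?: number;
}

// Rejects TTL values outside the documented 10..7200 range.
function validateTtl(ttl: number | undefined): void {
  if (ttl === undefined) return; // assumption: an unset TTL falls back to a default
  if (typeof ttl !== "number" || !isFinite(ttl) || ttl < 10 || ttl > 7200) {
    throw new Error("clientSecretTtlSeconds must be between 10 and 7200.");
  }
}

// Backend key always wins; the request key is only a fallback.
function resolveApiKey(options: KeyPolicyOptions, requestApiKey?: string): string {
  const backendKey = options.openaiApiKey?.trim();
  if (backendKey) return backendKey;

  const requestKey = requestApiKey?.trim();
  if (requestKey && options.allowApiKeyFromRequest) return requestKey;

  // Assumption: a supplied-but-disallowed request key gets the "disabled" error.
  if (requestKey) {
    throw new Error(
      "Passing apiKey from request is disabled. Set allowApiKeyFromRequest=true to enable it."
    );
  }
  throw new Error("Missing openaiApiKey in NavaiVoiceBackendOptions.");
}

validateTtl(600);
console.log(resolveApiKey({ openaiApiKey: "sk-backend", allowApiKeyFromRequest: true }, "sk-request"));
console.log(resolveApiKey({ allowApiKeyFromRequest: true }, "sk-request"));
```

The first call resolves to the backend key even though a request key is present and allowed; the second falls back to the request key because no backend key is configured.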
Request body accepted by route:
{
"model": "gpt-realtime",
"voice": "marin",
"instructions": "You are a helpful assistant.",
"language": "Spanish",
"voiceAccent": "neutral Latin American Spanish",
"voiceTone": "friendly and professional",
"apiKey": "sk-..."
}
Response:
{
"value": "ek_...",
"expires_at": 1730000000,
"speech": {
"provider": "elevenlabs"
}
}
Speech synthesis request body:
{
"text": "Hola, soy NAVAI.",
"voiceId": "voice_id_optional",
"modelId": "model_id_optional"
}
Speech synthesis response:
{
"provider": "elevenlabs",
"mimeType": "audio/mpeg",
"audioBase64": "SUQz..."
}
Dynamic Function Loading Internals
resolveNavaiBackendRuntimeConfig reads:
- explicit options first.
- then env key NAVAI_FUNCTIONS_FOLDERS.
- then fallback default src/ai/functions-modules.
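The precedence above (explicit option, then env key, then default) can be sketched as follows. The option name functionsFolders and the function resolveFunctionsFolders are assumptions for illustration; the real options type may use different names.

```typescript
const DEFAULT_FUNCTIONS_FOLDER = "src/ai/functions-modules";

// Hypothetical option shape; only the precedence order comes from the README.
interface FolderOptions {
  functionsFolders?: string;
}

function resolveFunctionsFolders(
  options: FolderOptions,
  env: Record<string, string | undefined>
): string[] {
  const raw =
    options.functionsFolders?.trim() ||
    env.NAVAI_FUNCTIONS_FOLDERS?.trim() ||
    DEFAULT_FUNCTIONS_FOLDER;

  // The value may be a CSV list, so split and drop empty entries.
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

console.log(resolveFunctionsFolders({ functionsFolders: "src/tools" }, { NAVAI_FUNCTIONS_FOLDERS: "src/ai" }));
console.log(resolveFunctionsFolders({}, { NAVAI_FUNCTIONS_FOLDERS: "src/ai, src/features/*/voice-functions" }));
console.log(resolveFunctionsFolders({}, {}));
```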
Optional multi-agent filter:
- env key NAVAI_AGENTS_FOLDERS
- when present and NAVAI_FUNCTIONS_FOLDERS points to a root folder like src/ai, backend only loads modules inside src/ai/<agent>/...
- first-level agent folders can include agent.config.ts, but backend execution only cares about matched function modules
Matcher formats accepted in NAVAI_FUNCTIONS_FOLDERS:
- folder: src/ai/functions-modules
- recursive folder: src/ai/functions-modules/...
- wildcard: src/features/*/voice-functions
- explicit file: src/ai/functions-modules/secret.ts
- CSV list: src/ai/functions-modules,...
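To make the matcher formats above concrete, here is a minimal sketch that classifies a single matcher entry and turns a wildcard matcher into a regular expression. classifyMatcher and wildcardToRegExp are hypothetical helpers, and the assumption that * matches exactly one path segment is not confirmed by the README.

```typescript
type MatcherKind = "recursive-folder" | "wildcard" | "file" | "folder";

// Classifies one entry of NAVAI_FUNCTIONS_FOLDERS (after splitting the CSV list).
function classifyMatcher(matcher: string): MatcherKind {
  if (/\/\.\.\.$/.test(matcher)) return "recursive-folder";
  if (matcher.indexOf("*") !== -1) return "wildcard";
  if (/\.(ts|js|mjs|cjs|mts|cts)$/.test(matcher)) return "file";
  return "folder";
}

// Assumption: "*" stands for exactly one path segment.
function wildcardToRegExp(matcher: string): RegExp {
  // Escape regex metacharacters except "*", then expand "*".
  const escaped = matcher.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, "[^/]+") + "(/|$)");
}

const voiceFunctions = wildcardToRegExp("src/features/*/voice-functions");
console.log(classifyMatcher("src/ai/functions-modules/..."));
console.log(voiceFunctions.test("src/features/billing/voice-functions/pay.ts"));
```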
Scanner behavior:
- scans from baseDir recursively.
- includes extensions from includeExtensions (default ts/js/mjs/cjs/mts/cts).
- excludes patterns from exclude (default ignores node_modules, dist, hidden paths).
- ignores *.d.ts.
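The scanner rules above amount to a per-file filter. The sketch below applies the documented defaults to a path relative to baseDir; shouldIncludeFile is a hypothetical name, and treating any path segment that starts with a dot as hidden is an assumption.

```typescript
const DEFAULT_EXTENSIONS = ["ts", "js", "mjs", "cjs", "mts", "cts"];

// Decides whether a path (relative to baseDir, "/" separated) is a candidate module.
function shouldIncludeFile(
  relativePath: string,
  includeExtensions: string[] = DEFAULT_EXTENSIONS
): boolean {
  const segments = relativePath.split("/").filter((s) => s !== "" && s !== ".");
  if (segments.length === 0) return false;
  const fileName = segments[segments.length - 1];

  // Type declaration files never contain runnable tools.
  if (/\.d\.ts$/.test(fileName)) return false;

  // Default excludes: node_modules, dist, and hidden paths.
  const excluded = segments.some(
    (s) => s === "node_modules" || s === "dist" || s.charAt(0) === "."
  );
  if (excluded) return false;

  const dot = fileName.lastIndexOf(".");
  const extension = dot === -1 ? "" : fileName.slice(dot + 1);
  return includeExtensions.indexOf(extension) !== -1;
}

console.log(shouldIncludeFile("src/ai/functions-modules/security.ts"));
console.log(shouldIncludeFile("src/types/index.d.ts"));
```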
Fallback behavior:
- if configured folders match nothing, warning is emitted.
- loader falls back to src/ai/functions-modules.
Export-to-Tool Mapping Rules
loadNavaiFunctions can transform these export shapes:
- Exported function.
  - creates one tool.
- Exported class.
  - creates one tool per callable instance method.
  - constructor args come from payload.constructorArgs.
  - method args come from payload.methodArgs.
- Exported object.
  - creates one tool per callable member.
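The mapping above can be sketched as a function that inspects one export and lists the tools it would yield, plus a helper showing how constructorArgs and methodArgs might be applied. All names here (describeExport, invokeClassMethod, the Export.member naming) are hypothetical, and detecting a class by looking for prototype methods is a heuristic chosen for the sketch, not necessarily what loadNavaiFunctions does.

```typescript
interface ToolSketch {
  name: string;
  kind: "function" | "class-method" | "object-member";
}

function describeExport(exportName: string, value: unknown): ToolSketch[] {
  if (typeof value === "function") {
    const proto: object | undefined = value.prototype;
    // Heuristic: a class has prototype methods other than its constructor.
    const methods = proto
      ? Object.getOwnPropertyNames(proto).filter(
          (n) =>
            n !== "constructor" &&
            typeof Object.getOwnPropertyDescriptor(proto, n)?.value === "function"
        )
      : [];
    if (methods.length > 0) {
      return methods.map((m): ToolSketch => ({ name: `${exportName}.${m}`, kind: "class-method" }));
    }
    return [{ name: exportName, kind: "function" }];
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    return Object.keys(record)
      .filter((k) => typeof record[k] === "function")
      .map((k): ToolSketch => ({ name: `${exportName}.${k}`, kind: "object-member" }));
  }
  return [];
}

// Builds an instance from payload.constructorArgs, then calls the method
// with payload.methodArgs.
function invokeClassMethod(
  Ctor: new (...args: any[]) => any,
  method: string,
  payload: { constructorArgs?: unknown[]; methodArgs?: unknown[] }
): unknown {
  const instance = new Ctor(...(payload.constructorArgs ?? []));
  return instance[method](...(payload.methodArgs ?? []));
}

class Greeter {
  greeting: string;
  constructor(greeting: string) {
    this.greeting = greeting;
  }
  greet(name: string): string {
    return `${this.greeting}, ${name}`;
  }
}

const greeterTools = describeExport("Greeter", Greeter);
const helperTools = describeExport("helpers", { ping: () => "pong", version: "1.0" });
const greetResult = invokeClassMethod(Greeter, "greet", {
  constructorArgs: ["Hola"],
  methodArgs: ["Ana"],
});
console.log(greeterTools, helperTools, greetResult);
```

Non-callable members such as version are skipped, which mirrors the "one tool per callable member" rule.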
Name normalization:
- converts to snake_case lowercase.
- removes unsafe characters.
- on collisions, appends suffix (_2, _3, ...).
- emits warning whenever a rename happens.
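A minimal sketch of the normalization rules above, assuming a simple camelCase-to-snake_case conversion. toSnakeCase and registerNames are hypothetical helpers; the exact character rules, the empty-name fallback, and the warning wording in the real package may differ.

```typescript
// Converts an export name to lowercase snake_case and strips unsafe characters.
function toSnakeCase(name: string): string {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1_$2") // split camelCase boundaries
    .replace(/[^A-Za-z0-9]+/g, "_") // collapse unsafe characters
    .replace(/^_+|_+$/g, "") // trim leading/trailing underscores
    .toLowerCase();
}

// Assigns unique tool names, appending _2, _3, ... on collisions.
function registerNames(exportNames: string[]): { names: string[]; warnings: string[] } {
  const taken = new Set<string>();
  const names: string[] = [];
  const warnings: string[] = [];

  for (const original of exportNames) {
    const base = toSnakeCase(original) || "tool"; // hypothetical fallback for empty names
    let candidate = base;
    let suffix = 2;
    while (taken.has(candidate)) {
      candidate = `${base}_${suffix}`;
      suffix += 1;
    }
    taken.add(candidate);
    names.push(candidate);
    if (candidate !== original) {
      warnings.push(`Renamed "${original}" to "${candidate}".`);
    }
  }
  return { names, warnings };
}

const registered = registerNames(["secretPassword", "secret_password", "Secret-Password"]);
console.log(registered.names);
// -> ["secret_password", "secret_password_2", "secret_password_3"]
```

All three inputs normalize to the same base name, so the second and third receive numeric suffixes, and each rename produces a warning entry.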
Invocation argument resolution:
- if payload.args exists, it is used as argument list.
- else if payload.value exists, it becomes first argument.
- else if payload has keys, whole payload is first argument.
- if callable arity expects one more arg, context is appended.
On /navai/functions/execute, context includes { req }.
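The argument resolution rules above can be sketched as one function. resolveInvocationArgs is a hypothetical name; requiring payload.args to be an array and reading "expects one more arg" as fn.length being exactly one greater than the resolved argument count are both assumptions.

```typescript
type Payload = Record<string, unknown>;

function resolveInvocationArgs(
  fn: (...args: any[]) => unknown,
  payload: Payload | undefined,
  context: unknown
): unknown[] {
  const p = payload ?? {};
  let args: unknown[];

  if (Array.isArray(p.args)) {
    args = p.args.slice(); // explicit argument list
  } else if ("value" in p) {
    args = [p.value]; // single value becomes the first argument
  } else if (Object.keys(p).length > 0) {
    args = [p]; // whole payload becomes the first argument
  } else {
    args = [];
  }

  // Append context only when the callable declares room for it.
  if (fn.length === args.length + 1) {
    args.push(context);
  }
  return args;
}

// Example tool that declares a trailing context parameter.
const checkPassword = (password: string, ctx: { req: unknown }) =>
  `${password}:${ctx.req === undefined ? "no-req" : "req"}`;

const fakeContext = { req: { ip: "127.0.0.1" } }; // stand-in for the Express request
const resolved = resolveInvocationArgs(checkPassword, { args: ["abc"] }, fakeContext);
console.log(resolved.length); // -> 2
```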
HTTP Contracts for Tools
GET /navai/functions response:
{
"items": [
{
"name": "secret_password",
"description": "Call exported function default.",
"source": "src/ai/functions-modules/security.ts#default"
}
],
"warnings": []
}
POST /navai/functions/execute body:
{
"function_name": "secret_password",
"payload": {
"args": ["abc"]
}
}
Success response:
{
"ok": true,
"function_name": "secret_password",
"source": "src/ai/functions-modules/security.ts#default",
"result": "..."
}
Unknown function response:
{
"error": "Unknown or disallowed function.",
"available_functions": ["..."]
}
Configuration and Env Rules
Main env keys:
- OPENAI_API_KEY
- OPENAI_REALTIME_MODEL
- OPENAI_REALTIME_VOICE
- OPENAI_REALTIME_INSTRUCTIONS
- OPENAI_REALTIME_LANGUAGE
- OPENAI_REALTIME_VOICE_ACCENT
- OPENAI_REALTIME_VOICE_TONE
- OPENAI_REALTIME_CLIENT_SECRET_TTL
- NAVAI_TTS_PROVIDER
- ELEVENLABS_API_KEY
- ELEVENLABS_BASE_URL
- ELEVENLABS_VOICE_ID
- ELEVENLABS_MODEL_ID
- ELEVENLABS_OUTPUT_FORMAT
- ELEVENLABS_OPTIMIZE_STREAMING_LATENCY
- ELEVENLABS_STABILITY
- ELEVENLABS_SIMILARITY_BOOST
- ELEVENLABS_STYLE
- ELEVENLABS_USE_SPEAKER_BOOST
- NAVAI_ALLOW_FRONTEND_API_KEY
- NAVAI_FUNCTIONS_FOLDERS
- NAVAI_AGENTS_FOLDERS
- NAVAI_FUNCTIONS_BASE_DIR
API key policy from env:
- if OPENAI_API_KEY exists, request API keys are denied unless NAVAI_ALLOW_FRONTEND_API_KEY=true.
- if OPENAI_API_KEY is missing, request API keys are allowed as fallback.
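The env policy above reduces to a single boolean. The sketch below derives it from an env-like object; allowsRequestApiKey is a hypothetical name, and accepting "true" case-insensitively with surrounding whitespace trimmed is an assumption about parsing.

```typescript
// Returns whether a request-supplied API key would be accepted under the env policy.
function allowsRequestApiKey(env: Record<string, string | undefined>): boolean {
  const hasBackendKey = Boolean(env.OPENAI_API_KEY && env.OPENAI_API_KEY.trim());

  // Without a backend key, request keys are the only option and are allowed.
  if (!hasBackendKey) return true;

  // With a backend key, request keys need an explicit opt-in.
  return (env.NAVAI_ALLOW_FRONTEND_API_KEY ?? "").trim().toLowerCase() === "true";
}

console.log(allowsRequestApiKey({ OPENAI_API_KEY: "sk-backend" }));
console.log(allowsRequestApiKey({ OPENAI_API_KEY: "sk-backend", NAVAI_ALLOW_FRONTEND_API_KEY: "true" }));
console.log(allowsRequestApiKey({}));
```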
Minimal Integration Example
import express from "express";
import { registerNavaiExpressRoutes } from "@navai/voice-backend";
const app = express();
app.use(express.json());
registerNavaiExpressRoutes(app, {
backendOptions: {
openaiApiKey: process.env.OPENAI_API_KEY,
defaultModel: "gpt-realtime",
defaultVoice: "marin",
clientSecretTtlSeconds: 600
}
});
app.listen(3000);
Operational Guidance
Production recommendations:
- keep OPENAI_API_KEY only on server.
- keep ELEVENLABS_API_KEY only on server.
- keep NAVAI_ALLOW_FRONTEND_API_KEY=false in production.
- whitelist CORS origins at app layer.
- monitor and surface warnings from both runtime and registry.
- restart backend when function files are changed if you need updated registry.
Common errors:
- Missing openaiApiKey in NavaiVoiceBackendOptions.
- Passing apiKey from request is disabled. Set allowApiKeyFromRequest=true to enable it.
- clientSecretTtlSeconds must be between 10 and 7200.
Related Docs
- Package index: README.md
- Spanish version: README.es.md
- Frontend package: ../voice-frontend/README.md
- Mobile package: ../voice-mobile/README.md
- Playground API: ../../apps/playground-api/README.md
- Playground Web: ../../apps/playground-web/README.md
- Playground Mobile: ../../apps/playground-mobile/README.md
