
react-native-deepgram

v2.0.4

Published

React Native SDK for Deepgram's AI-powered speech-to-text, real-time transcription, and text intelligence APIs. Supports live audio streaming, file transcription, sentiment analysis, and topic detection for iOS and Android.

Readme

react-native-deepgram

License: MIT

react-native-deepgram brings Deepgram's AI platform to React Native & Expo.

✅ Supports Speech-to-Text v1 and the new Speech-to-Text v2 (Flux) streaming API alongside Text-to-Speech, Text Intelligence, and the Management API.

Table of contents

  1. Features
  2. Installation
  3. Expo config plugin
  4. Configuration
  5. Usage overview
  6. Voice Agent
  7. Speech-to-Text
  8. Text-to-Speech
  9. Text Intelligence
  10. Management API
  11. Example app
  12. Roadmap
  13. Contributing
  14. License

Features

  • 🔊 Live Speech-to-Text – capture PCM audio and stream it over WebSocket (STT v1 or v2/Flux).
  • 📄 File Transcription – send audio files/URIs to Deepgram and receive transcripts.
  • 🎤 Text-to-Speech – synthesize speech with HTTP requests or WebSocket streaming controls.
  • 🗣️ Voice Agent – orchestrate realtime conversational agents with microphone capture + audio playback.
  • 🧠 Text Intelligence – summarisation, topic detection, intents, sentiment and more.
  • 🛠️ Management API – list models, keys, usage, projects, balances, etc.
  • ⚙️ Expo config plugin – automatic native configuration for managed and bare workflows.

Installation

yarn add react-native-deepgram
# or
npm install react-native-deepgram

iOS (CocoaPods)

cd ios && pod install

Expo

// app.config.js
module.exports = {
  expo: {
    plugins: [
      [
        'react-native-deepgram',
        {
          microphonePermission:
            'Allow $(PRODUCT_NAME) to access your microphone.',
        },
      ],
    ],
  },
};
npx expo prebuild
npx expo run:ios   # or expo run:android

Expo config plugin

The package ships with an Expo config plugin (exported from app.plugin.js) that keeps microphone permissions in sync for both platforms:

  • Android – automatically adds android.permission.RECORD_AUDIO to your manifest if it is missing.
  • iOS – sets NSMicrophoneUsageDescription with the message you provide (or a sensible fallback).

Options

You can customise the iOS prompt via the microphonePermission option:

// app.config.js
module.exports = {
  expo: {
    plugins: [
      [
        'react-native-deepgram',
        {
          microphonePermission:
            'Allow $(PRODUCT_NAME) to capture audio for real-time transcription.',
        },
      ],
    ],
  },
};

🧭 Need the plugin in a bare React Native project? Import it via require('react-native-deepgram/app.plugin.js') in your config plugin pipeline.


Configuration

import { configure } from 'react-native-deepgram';

configure({ apiKey: 'YOUR_DEEPGRAM_API_KEY' });

Heads‑up 🔐 The Management API needs a key with management scopes. Do not ship production keys in source control—prefer environment variables, Expo secrets, or a backend proxy.
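One way to keep the key out of source control is to resolve it from the environment at startup. A minimal sketch; the `DEEPGRAM_API_KEY` variable name and the `resolveDeepgramKey` helper are hypothetical, not part of the library:

```javascript
// Hypothetical helper: read the key from the environment instead of
// hard-coding it. DEEPGRAM_API_KEY is an assumed variable name.
function resolveDeepgramKey(env = process.env) {
  const key = env.DEEPGRAM_API_KEY;
  if (!key) throw new Error('DEEPGRAM_API_KEY is not set');
  return key;
}

// At app startup:
// configure({ apiKey: resolveDeepgramKey() });
```

For Expo apps, the same idea applies with `EXPO_PUBLIC_`-prefixed variables or a backend proxy that mints short-lived keys.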


Usage overview

| Hook | Purpose |
| ----------------------------- | ---------------------------------------------------- |
| useDeepgramVoiceAgent | Build conversational agents with streaming audio I/O |
| useDeepgramSpeechToText | Live microphone streaming and file transcription |
| useDeepgramTextToSpeech | Text-to-Speech synthesis (HTTP + WebSocket streaming) |
| useDeepgramTextIntelligence | Text analysis (summaries, topics, intents, sentiment) |
| useDeepgramManagement | Typed wrapper around the Management REST API |

💡 Pro tip: All hooks now export a state object (and other reactive values) so you can easily track connection status, errors, and transcripts without maintaining your own state.


Voice Agent (useDeepgramVoiceAgent)

useDeepgramVoiceAgent connects to wss://agent.deepgram.com/v1/agent/converse, captures microphone audio, and optionally auto-plays the agent's streamed responses. It wraps the full Voice Agent messaging surface so you can react to conversation updates, function calls, warnings, and raw PCM audio.

🔊 Audio Handling: This hook uses AVAudioEngine on iOS for hardware-accelerated echo cancellation, ensuring the agent doesn't hear itself speak. It also manages the audio session automatically.

Quick start

const {
  connect,
  disconnect,
  state, // { connectionState, error, warning }
  agentStatus, // { thinking, latency }
  conversation, // Array<{ role, content }>
  injectUserMessage,
  sendFunctionCallResponse,
  updatePrompt,
} = useDeepgramVoiceAgent({
  trackState: true, // Enable reactive state tracking
  trackConversation: true, // Enable conversation history tracking
  trackAgentStatus: true, // Enable agent status tracking
  autoPlayAudio: true, // Automatically play agent audio
  defaultSettings: {
    audio: {
      input: { encoding: 'linear16', sample_rate: 24_000 },
      output: { encoding: 'linear16', sample_rate: 24_000, container: 'none' },
    },
    agent: {
      language: 'en',
      greeting: 'Hello! How can I help you today?',
      listen: {
        provider: { type: 'deepgram', model: 'nova-3', smart_format: true },
      },
      think: {
        provider: { type: 'open_ai', model: 'gpt-4o', temperature: 0.7 },
        prompt: 'You are a helpful voice concierge.',
      },
      speak: {
        provider: { type: 'deepgram', model: 'aura-2-asteria-en' },
      },
    },
    tags: ['demo'],
  },
  onConversationText: (msg) => {
    console.log(`${msg.role}: ${msg.content}`);
  },
  onAgentThinking: (msg) => console.log('thinking:', msg.content),
  onAgentAudioDone: () => console.log('Agent finished speaking'),
  onServerError: (err) => console.error('Agent error', err.description),
});

const begin = async () => {
  try {
    await connect();
  } catch (err) {
    console.error('Failed to start agent', err);
  }
};

const askQuestion = () => {
  injectUserMessage("What's the weather like?");
};

const provideTooling = () => {
  sendFunctionCallResponse({
    id: 'func_12345',
    name: 'get_weather',
    content: JSON.stringify({ temperature: 72, condition: 'sunny' }),
    client_side: true,
  });
};

const rePrompt = () => {
  updatePrompt('You are now a helpful travel assistant.');
};

return (
  <>
    <Text>Status: {state.connectionState}</Text>
    <Button title="Start agent" onPress={begin} />
    <Button title="Ask" onPress={askQuestion} />
    <Button title="Send tool output" onPress={provideTooling} />
    <Button title="Update prompt" onPress={rePrompt} />
    <Button title="Stop" onPress={disconnect} />
  </>
);

💬 The hook requests mic permissions, streams PCM to Deepgram, and surfaces the agent's replies as text; with autoPlayAudio enabled, echo cancellation keeps that playback from feeding back into the microphone.

API reference (Voice Agent)

Hook props

| Prop | Type | Description |
| ---- | ---- | ----------- |
| endpoint | string | WebSocket endpoint used for the agent conversation (defaults to wss://agent.deepgram.com/v1/agent/converse). |
| defaultSettings | DeepgramVoiceAgentSettings | Base Settings payload sent on connect; merge per-call overrides via connect(override). |
| autoStartMicrophone | boolean | Automatically requests mic access and starts streaming PCM when true (default). |
| autoPlayAudio | boolean | Automatically plays received audio using the native player (default: true). |
| trackState | boolean | Enable reactive state tracking (connection, errors, warnings) via the state return value (default: false). |
| trackConversation | boolean | Enable conversation history tracking via the conversation return value (default: false). |
| trackAgentStatus | boolean | Enable agent status tracking (thinking, latency) via the agentStatus return value (default: false). |
| downsampleFactor | number | Manually override the downsample ratio applied to captured audio (defaults to a heuristic based on the requested sample rate). |

Callbacks

| Callback | Signature | Fired when |
| -------- | --------- | ---------- |
| onBeforeConnect | () => void | connect is called, before requesting mic permissions or opening the socket. |
| onConnect | () => void | The socket opens and the initial settings payload is delivered. |
| onClose | (event?: any) => void | The socket closes (manual disconnect or remote). |
| onError | (error: unknown) => void | Any unexpected error occurs (mic, playback, socket send, etc.). |
| onMessage | (message: DeepgramVoiceAgentServerMessage) => void | Every JSON message from the Voice Agent API. |
| onWelcome | (message: DeepgramVoiceAgentWelcomeMessage) => void | The agent returns the initial Welcome envelope. |
| onSettingsApplied | (message: DeepgramVoiceAgentSettingsAppliedMessage) => void | Settings are acknowledged by the agent. |
| onConversationText | (message: DeepgramVoiceAgentConversationTextMessage) => void | Transcript updates (role + content) arrive. |
| onAgentThinking | (message: DeepgramVoiceAgentAgentThinkingMessage) => void | The agent reports internal reasoning state. |
| onAgentStartedSpeaking | (message: DeepgramVoiceAgentAgentStartedSpeakingMessage) => void | A response playback session begins (latency metrics included). |
| onAgentAudioDone | (message: DeepgramVoiceAgentAgentAudioDoneMessage) => void | The agent finishes emitting audio for a turn. |
| onUserStartedSpeaking | (message: DeepgramVoiceAgentUserStartedSpeakingMessage) => void | Server-side VAD detects the user speaking. |
| onFunctionCallRequest | (message: DeepgramVoiceAgentFunctionCallRequestMessage) => void | The agent asks the client to execute a tool marked client_side: true. |
| onFunctionCallResponse | (message: DeepgramVoiceAgentReceiveFunctionCallResponseMessage) => void | The server shares the outcome of a non-client-side function call. |
| onPromptUpdated | (message: DeepgramVoiceAgentPromptUpdatedMessage) => void | The active prompt is updated (e.g., after updatePrompt). |
| onSpeakUpdated | (message: DeepgramVoiceAgentSpeakUpdatedMessage) => void | The active speak configuration changes (sent by the server). |
| onInjectionRefused | (message: DeepgramVoiceAgentInjectionRefusedMessage) => void | An inject request is rejected (typically while the agent is speaking). |
| onWarning | (message: DeepgramVoiceAgentWarningMessage) => void | The API surfaces a non-fatal warning (e.g., degraded audio quality). |
| onServerError | (message: DeepgramVoiceAgentErrorMessage) => void | The API reports a structured error payload (description + code). |

Returned methods

| Method | Signature | Description |
| ------ | --------- | ----------- |
| connect | (settings?: DeepgramVoiceAgentSettings) => Promise<void> | Opens the socket, optionally merges additional settings, and begins microphone streaming. |
| disconnect | () => void | Tears down the socket, stops recording, and removes listeners. |
| sendMessage | (message: DeepgramVoiceAgentClientMessage) => boolean | Sends a pre-built client envelope (handy for custom message types). |
| sendSettings | (settings: DeepgramVoiceAgentSettings) => boolean | Sends a Settings message mid-session (merged with the type field). |
| injectUserMessage | (content: string) => boolean | Injects a user-side text message. |
| injectAgentMessage | (message: string) => boolean | Injects an assistant-side text message. |
| sendFunctionCallResponse | (response: Omit<DeepgramVoiceAgentFunctionCallResponseMessage, 'type'>) => boolean | Returns tool results for client-side function calls. |
| sendKeepAlive | () => boolean | Emits a KeepAlive ping to keep the session warm. |
| updatePrompt | (prompt: string) => boolean | Replaces the active system prompt. |
| sendMedia | (chunk: ArrayBuffer \| Uint8Array \| number[]) => boolean | Streams additional PCM audio to the agent (e.g., pre-recorded buffers). |
| isConnected | () => boolean | Returns true when the socket is open. |
| clearConversation | () => void | Clears the internal conversation history. |
| state | DeepgramVoiceAgentState | Reactive state object (requires trackState: true). |
| conversation | DeepgramVoiceAgentConversationMessage[] | Reactive conversation history (requires trackConversation: true). |
| agentStatus | DeepgramVoiceAgentStatus | Reactive agent status (requires trackAgentStatus: true). |

Settings payload (DeepgramVoiceAgentSettings)

| Field | Type | Purpose |
| ----- | ---- | ------- |
| tags | string[] | Labels applied to the session for analytics/routing. |
| flags.history | boolean | Enable prior history playback to the agent. |
| audio.input | DeepgramVoiceAgentAudioConfig | Configure encoding/sample rate for microphone audio. |
| audio.output | DeepgramVoiceAgentAudioConfig | Choose output encoding/sample rate/bitrate for agent speech. |
| agent.language | string | Primary language for the conversation. |
| agent.context.messages | DeepgramVoiceAgentContextMessage[] | Seed the conversation with prior turns or system notes. |
| agent.listen.provider | DeepgramVoiceAgentListenProvider | Speech recognition provider/model configuration. |
| agent.think.provider | DeepgramVoiceAgentThinkProvider | LLM selection (type, model, temperature, etc.). |
| agent.think.functions | DeepgramVoiceAgentFunctionConfig[] | Tooling exposed to the agent (name, parameters, optional endpoint metadata). |
| agent.think.prompt | string | System prompt presented to the thinking provider. |
| agent.speak.provider | Record<string, unknown> | Text-to-speech model selection for spoken replies. |
| agent.greeting | string | Optional greeting played once settings are applied. |
| mip_opt_out | boolean | Opt the session out of the Model Improvement Program. |
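As a sketch of how these fields compose, a Settings payload that seeds a prompt and exposes one client-side tool might look like this; the get_weather function and its parameter schema are hypothetical, not part of the library:

```javascript
// Hypothetical Settings payload; field names follow the table above.
const settings = {
  audio: {
    input: { encoding: 'linear16', sample_rate: 24_000 },
    output: { encoding: 'linear16', sample_rate: 24_000, container: 'none' },
  },
  agent: {
    language: 'en',
    think: {
      provider: { type: 'open_ai', model: 'gpt-4o' },
      prompt: 'You are a concise weather assistant.',
      functions: [
        {
          name: 'get_weather', // hypothetical client-side tool
          description: 'Look up current conditions for a city.',
          parameters: {
            type: 'object',
            properties: { city: { type: 'string' } },
            required: ['city'],
          },
        },
      ],
    },
  },
  tags: ['weather-demo'],
};
```

Pass an object like this as defaultSettings (or as the override argument to connect), then answer the resulting onFunctionCallRequest with sendFunctionCallResponse.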


Speech-to-Text (useDeepgramSpeechToText)

The speech hook streams microphone audio using WebSockets and can also transcribe prerecorded audio sources. It defaults to STT v1 but automatically boots into Flux when apiVersion: 'v2' is supplied (defaulting the model to flux-general-en).

Live streaming quick start

const {
  startListening,
  stopListening,
  state, // { status, error }
  transcript, // "Hello world..."
} = useDeepgramSpeechToText({
  trackState: true,
  trackTranscript: true,
  onTranscript: console.log,
  live: {
    apiVersion: 'v2',
    model: 'flux-general-en',
    punctuate: true,
    eotThreshold: 0.55,
  },
});

<Text>Transcript: {transcript}</Text>
<Button
  title="Start"
  onPress={() => startListening({ keywords: ['Deepgram'] })}
/>
<Button title="Stop" onPress={stopListening} />

💡 When you opt into apiVersion: 'v2' the hook automatically selects flux-general-en if you do not provide a model.

File transcription quick start

const { transcribeFile } = useDeepgramSpeechToText({
  onTranscribeSuccess: (text) => console.log(text),
  prerecorded: {
    punctuate: true,
    summarize: 'v2',
  },
});

const pickFile = async () => {
  const f = await DocumentPicker.getDocumentAsync({ type: 'audio/*' });
  if (f.type === 'success') {
    await transcribeFile(f, { topics: true, intents: true });
  }
};

API reference (Speech-to-Text)

Hook props

| Prop | Type | Description |
| ---- | ---- | ----------- |
| onBeforeStart | () => void | Invoked before requesting mic permissions or starting a stream. |
| onStart | () => void | Fired once the WebSocket opens. |
| onTranscript | (transcript: string) => void | Called for every transcript update (partial and final). |
| onError | (error: unknown) => void | Receives streaming errors. |
| onEnd | () => void | Fired when the socket closes. |
| onBeforeTranscribe | () => void | Called before posting a prerecorded transcription request. |
| onTranscribeSuccess | (transcript: string) => void | Receives the final transcript for prerecorded audio. |
| onTranscribeError | (error: unknown) => void | Fired if prerecorded transcription fails. |
| live | DeepgramLiveListenOptions | Default options merged into every live stream. |
| prerecorded | DeepgramPrerecordedOptions | Default options merged into every file transcription. |
| trackState | boolean | Enable reactive state tracking via the state return value (default: false). |
| trackTranscript | boolean | Enable reactive transcript tracking via the transcript return value (default: false). |

Returned methods

| Method | Signature | Description |
| ------ | --------- | ----------- |
| startListening | (options?: DeepgramLiveListenOptions) => Promise<void> | Requests mic access, starts recording, and streams audio to Deepgram. |
| stopListening | () => void | Stops recording and closes the active WebSocket. |
| transcribeFile | (file: DeepgramPrerecordedSource, options?: DeepgramPrerecordedOptions) => Promise<void> | Uploads a file/URI/URL and resolves via the success/error callbacks. |
| state | DeepgramSpeechToTextState | Reactive state object (requires trackState: true). |
| transcript | string | Reactive final transcript (requires trackTranscript: true). |
| interimTranscript | string | Reactive interim transcript (requires trackTranscript: true). |

Live transcription options (DeepgramLiveListenOptions)

| Option | Type | Purpose | Default |
| ------ | ---- | ------- | ------- |
| apiVersion | 'v1' \| 'v2' | Selects the realtime API generation ('v2' unlocks Flux streaming). | 'v1' |
| callback | string | Webhook URL invoked when the stream finishes. | – |
| callbackMethod | 'POST' \| 'GET' \| 'PUT' \| 'DELETE' | HTTP verb Deepgram should use for callback. | 'POST' |
| channels | number | Number of audio channels in the input. | – |
| diarize | boolean | Separate speakers into individual tracks. | Disabled |
| dictation | boolean | Enable dictation features (punctuation, formatting). | Disabled |
| encoding | DeepgramLiveListenEncoding | Audio codec supplied to Deepgram. | 'linear16' |
| endpointing | number \| boolean | Control endpoint detection (false disables). | – |
| extra | Record<string, string \| number \| boolean> | Attach custom metadata returned with the response. | – |
| fillerWords | boolean | Include filler words such as "um"/"uh". | Disabled |
| interimResults | boolean | Emit interim (non-final) transcripts. | Disabled |
| keyterm | string \| string[] | Provide key terms to bias Nova-3 transcription. | – |
| keywords | string \| string[] | Boost or suppress keywords. | – |
| language | string | BCP-47 language hint (e.g. en-US). | Auto |
| mipOptOut | boolean | Opt out of the Model Improvement Program. | Disabled |
| model | DeepgramLiveListenModel | Streaming model to request. | 'nova-2' (v1) / 'flux-general-en' (v2) |
| multichannel | boolean | Transcribe each channel independently. | Disabled |
| numerals | boolean | Convert spoken numbers into digits. | Disabled |
| profanityFilter | boolean | Remove profanity from transcripts. | Disabled |
| punctuate | boolean | Auto-insert punctuation and capitalization. | Disabled |
| redact | DeepgramLiveListenRedaction \| DeepgramLiveListenRedaction[] | Remove sensitive content such as PCI data. | – |
| replace | string \| string[] | Replace specific terms in the output. | – |
| sampleRate | number | Sample rate of the PCM audio being sent. | 16000 |
| search | string \| string[] | Return timestamps for search terms. | – |
| smartFormat | boolean | Apply Deepgram smart formatting. | Disabled |
| tag | string | Label the request for reporting. | – |
| eagerEotThreshold | number | Confidence required to emit an eager turn (Flux only). | – |
| eotThreshold | number | Confidence required to finalise a turn (Flux only). | – |
| eotTimeoutMs | number | Silence timeout before closing a turn (Flux only). | – |
| utteranceEndMs | number | Delay before emitting an utterance end event. | – |
| vadEvents | boolean | Emit voice activity detection events. | Disabled |
| version | string | Request a specific model version. | – |

Prerecorded transcription options (DeepgramPrerecordedOptions)

| Option | Type | Purpose | Default |
| ------ | ---- | ------- | ------- |
| callback | string | Webhook URL invoked once transcription finishes. | – |
| callbackMethod | DeepgramPrerecordedCallbackMethod | HTTP verb used for callback. | 'POST' |
| extra | DeepgramPrerecordedExtra | Metadata returned with the response. | – |
| sentiment | boolean | Run sentiment analysis. | Disabled |
| summarize | DeepgramPrerecordedSummarize | Request AI summaries (true, 'v1', or 'v2'). | Disabled |
| tag | string \| string[] | Label the request. | – |
| topics | boolean | Detect topics. | Disabled |
| customTopic | string \| string[] | Provide additional topics to monitor. | – |
| customTopicMode | DeepgramPrerecordedCustomMode | Interpret customTopic as 'extended' or 'strict'. | 'extended' |
| intents | boolean | Detect intents. | Disabled |
| customIntent | string \| string[] | Provide custom intents to bias detection. | – |
| customIntentMode | DeepgramPrerecordedCustomMode | Interpret customIntent as 'extended' or 'strict'. | 'extended' |
| detectEntities | boolean | Extract entities (names, places, etc.). | Disabled |
| detectLanguage | boolean \| string \| string[] | Auto-detect language or limit detection. | Disabled |
| diarize | boolean | Enable speaker diarisation. | Disabled |
| dictation | boolean | Enable dictation formatting. | Disabled |
| encoding | DeepgramPrerecordedEncoding | Encoding/codec of the uploaded audio. | – |
| fillerWords | boolean | Include filler words. | Disabled |
| keyterm | string \| string[] | Provide key terms to bias Nova-3. | – |
| keywords | string \| string[] | Boost or suppress keywords. | – |
| language | string | Primary spoken language hint (BCP-47). | Auto |
| measurements | boolean | Convert measurements into abbreviations. | Disabled |
| model | DeepgramPrerecordedModel | Model to use for transcription. | API default |
| multichannel | boolean | Transcribe each channel independently. | Disabled |
| numerals | boolean | Convert spoken numbers into digits. | Disabled |
| paragraphs | boolean | Split transcript into paragraphs. | Disabled |
| profanityFilter | boolean | Remove profanity from the transcript. | Disabled |
| punctuate | boolean | Auto-insert punctuation and capitalisation. | Disabled |
| redact | DeepgramPrerecordedRedaction \| DeepgramPrerecordedRedaction[] | Remove sensitive content (PCI/PII). | – |
| replace | string \| string[] | Replace specific terms in the output. | – |
| search | string \| string[] | Return timestamps for search terms. | – |
| smartFormat | boolean | Apply Deepgram smart formatting. | Disabled |
| utterances | boolean | Return utterance-level timestamps. | Disabled |
| uttSplit | number | Pause duration (seconds) used to split utterances. | – |
| version | DeepgramPrerecordedVersion | Request a specific model version (e.g. 'latest'). | API default ('latest') |


Text-to-Speech (useDeepgramTextToSpeech)

Generate audio via a single HTTP call or stream interactive responses over WebSocket. The hook exposes granular configuration for both request paths.

HTTP synthesis quick start

const { synthesize } = useDeepgramTextToSpeech({
  options: {
    http: {
      model: 'aura-2-asteria-en',
      encoding: 'mp3',
      bitRate: 48000,
      container: 'none',
    },
  },
  onSynthesizeSuccess: (buffer) => {
    console.log('Received bytes', buffer.byteLength);
  },
});

await synthesize('Hello from Deepgram!');

Streaming quick start

const {
  startStreaming,
  sendText,
  flushStream,
  clearStream,
  closeStreamGracefully,
  stopStreaming,
  state, // { status, error }
} = useDeepgramTextToSpeech({
  trackState: true,
  autoPlayAudio: true, // Automatically play received audio
  options: {
    stream: {
      model: 'aura-2-asteria-en',
      encoding: 'linear16',
      sampleRate: 24000,
      autoFlush: false,
    },
  },
  onAudioChunk: (chunk) => console.log('Audio chunk', chunk.byteLength),
  onStreamMetadata: (meta) => console.log(meta.model_name),
});

await startStreaming('Booting stream…');
sendText('Queue another sentence', { sequenceId: 1 });
flushStream();
closeStreamGracefully();

API reference

Hook props

| Prop | Type | Description |
| ---- | ---- | ----------- |
| onBeforeSynthesize | () => void | Called before dispatching an HTTP synthesis request. |
| onSynthesizeSuccess | (audio: ArrayBuffer) => void | Receives the raw audio bytes when the HTTP request succeeds. |
| onSynthesizeError | (error: unknown) => void | Fired if the HTTP request fails. |
| onBeforeStream | () => void | Called prior to opening the WebSocket stream. |
| onStreamStart | () => void | Fired once the socket is open and ready. |
| onAudioChunk | (chunk: ArrayBuffer) => void | Called for each PCM chunk received from the stream. |
| onStreamMetadata | (metadata: DeepgramTextToSpeechStreamMetadataMessage) => void | Emits metadata describing the current stream. |
| onStreamFlushed | (event: DeepgramTextToSpeechStreamFlushedMessage) => void | Raised when Deepgram confirms a flush. |
| onStreamCleared | (event: DeepgramTextToSpeechStreamClearedMessage) => void | Raised when Deepgram confirms a clear. |
| onStreamWarning | (warning: DeepgramTextToSpeechStreamWarningMessage) => void | Raised when Deepgram warns about the stream. |
| onStreamError | (error: unknown) => void | Fired when the WebSocket errors. |
| onStreamEnd | () => void | Fired when the stream closes (gracefully or otherwise). |
| options | UseDeepgramTextToSpeechOptions | Default configuration merged into HTTP and streaming requests. |
| autoPlayAudio | boolean | Automatically plays received audio using the native player (default: true). |
| trackState | boolean | Enable reactive state tracking via the state return value (default: false). |

Returned methods

| Method | Signature | Description |
| ------ | --------- | ----------- |
| synthesize | (text: string) => Promise<ArrayBuffer> | Sends a single piece of text via REST and resolves with the full audio buffer. |
| startStreaming | (text: string) => Promise<void> | Opens the streaming WebSocket and queues the first message. |
| sendMessage | (message: DeepgramTextToSpeechStreamInputMessage) => boolean | Sends a raw control message (Text, Flush, Clear, Close) to the active stream. |
| sendText | (text: string, options?: { flush?: boolean; sequenceId?: number }) => boolean | Queues additional text frames, optionally suppressing auto-flush or setting a sequence id. |
| flushStream | () => boolean | Requests Deepgram to emit all buffered audio immediately. |
| clearStream | () => boolean | Clears buffered text/audio without closing the socket. |
| closeStreamGracefully | () => boolean | Asks Deepgram to finish outstanding audio then close the stream. |
| stopStreaming | () => void | Force-closes the socket and releases resources. |
| state | DeepgramTextToSpeechState | Reactive state object (requires trackState: true). |

Configuration (UseDeepgramTextToSpeechOptions)

UseDeepgramTextToSpeechOptions mirrors the SDK's structure and is merged into both HTTP and WebSocket requests.

| Option | Type | Applies to | Purpose |
| ------ | ---- | ---------- | ------- |
| model* | DeepgramTextToSpeechModel \| (string & {}) | Both | Legacy shortcut for selecting a model (prefer per-transport model). |
| encoding* | DeepgramTextToSpeechEncoding | Both | Legacy shortcut for selecting encoding (prefer http.encoding / stream.encoding). |
| sampleRate* | DeepgramTextToSpeechSampleRate | Both | Legacy shortcut for sample rate (prefer transport-specific overrides). |
| bitRate* | DeepgramTextToSpeechBitRate | HTTP | Legacy shortcut for bit rate. |
| container* | DeepgramTextToSpeechContainer | HTTP | Legacy shortcut for container. |
| format* | 'mp3' \| 'wav' \| 'opus' \| 'pcm' \| (string & {}) | HTTP | Legacy shortcut for container/format. |
| callback* | string | HTTP | Legacy shortcut for callback URL. |
| callbackMethod* | DeepgramTextToSpeechCallbackMethod | HTTP | Legacy shortcut for callback method. |
| mipOptOut* | boolean | Both | Legacy shortcut for Model Improvement Program opt-out. |
| queryParams | Record<string, string \| number \| boolean> | Both | Shared query string parameters appended to all requests. |
| http | DeepgramTextToSpeechHttpOptions | HTTP | Fine-grained HTTP synthesis configuration. |
| stream | DeepgramTextToSpeechStreamOptions | Streaming | Fine-grained streaming configuration. |

*Marked fields are supported for backwards compatibility but the transport-specific http/stream options are recommended.

HTTP options (DeepgramTextToSpeechHttpOptions)

| Option | Type | Purpose |
| ------ | ---- | ------- |
| model | DeepgramTextToSpeechModel \| (string & {}) | Select the TTS voice/model. |
| encoding | DeepgramTextToSpeechHttpEncoding | Output audio codec. |
| sampleRate | DeepgramTextToSpeechSampleRate | Output sample rate in Hz. |
| container | DeepgramTextToSpeechContainer | Wrap audio in a container ('none', 'wav', 'ogg'). |
| format | 'mp3' \| 'wav' \| 'opus' \| 'pcm' \| (string & {}) | Deprecated alias for container. |
| bitRate | DeepgramTextToSpeechBitRate | Bit rate for compressed formats (e.g. MP3). |
| callback | string | Webhook URL invoked after synthesis completes. |
| callbackMethod | DeepgramTextToSpeechCallbackMethod | HTTP verb used for the callback. |
| mipOptOut | boolean | Opt out of the Model Improvement Program. |
| queryParams | Record<string, string \| number \| boolean> | Extra query parameters appended to the request. |

Streaming options (DeepgramTextToSpeechStreamOptions)

| Option        | Type                                          | Purpose                                                          |
| ------------- | --------------------------------------------- | ---------------------------------------------------------------- |
| `model`       | `DeepgramTextToSpeechModel \| (string & {})`  | Select the streaming voice/model.                                |
| `encoding`    | `DeepgramTextToSpeechStreamEncoding`          | Output PCM encoding for streamed chunks.                         |
| `sampleRate`  | `DeepgramTextToSpeechSampleRate`              | Output sample rate in Hz.                                        |
| `mipOptOut`   | `boolean`                                     | Opt out of the Model Improvement Program.                        |
| `queryParams` | `Record<string, string \| number \| boolean>` | Extra query parameters appended to the streaming URL.            |
| `autoFlush`   | `boolean`                                     | Automatically flush after each `sendText` call (defaults to `true`). |
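
Putting the two transports together, a minimal sketch — the hook name `useDeepgramTextToSpeech` and its call shape are inferred from the option tables above and the package's other hooks, so verify them against the actual exports:

```ts
// Sketch only: hook name and shape are assumptions inferred from the
// option tables above; verify against the package's exported API.
import { useDeepgramTextToSpeech } from 'react-native-deepgram';

const tts = useDeepgramTextToSpeech({
  // Prefer transport-specific options over the legacy top-level shortcuts.
  http: {
    model: 'aura-2-thalia-en', // an Aura-2 voice
    encoding: 'mp3',
    bitRate: 48000,
  },
  stream: {
    model: 'aura-2-thalia-en',
    encoding: 'linear16', // raw PCM chunks
    sampleRate: 24000,
    autoFlush: true, // flush after each sendText (the default)
  },
});
```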


Text Intelligence (useDeepgramTextIntelligence)

Run summarisation, topic detection, intent detection, sentiment analysis, and more over plain text or URLs.

```ts
const { analyze, state } = useDeepgramTextIntelligence({
  trackState: true,
  onAnalyzeSuccess: (result) => console.log(result.summary),
  options: {
    summarize: true,
    topics: true,
    intents: true,
    language: 'en-US',
  },
});

await analyze({ text: 'Deepgram makes voice data useful.' });
```

Options (UseDeepgramTextIntelligenceOptions)

| Option             | Type                               | Purpose                                                                          |
| ------------------ | ---------------------------------- | -------------------------------------------------------------------------------- |
| `summarize`        | `boolean`                          | Run summarisation on the input.                                                   |
| `topics`           | `boolean`                          | Detect topics.                                                                    |
| `customTopic`      | `string \| string[]`               | Supply additional topics to monitor.                                              |
| `customTopicMode`  | `'extended' \| 'strict'`           | Interpret custom topics as additive (`extended`) or exact (`strict`).             |
| `intents`          | `boolean`                          | Detect intents.                                                                   |
| `customIntent`     | `string \| string[]`               | Provide custom intents to bias detection.                                         |
| `customIntentMode` | `'extended' \| 'strict'`           | Interpret custom intents as additive (`extended`) or exact (`strict`).            |
| `sentiment`        | `boolean`                          | Run sentiment analysis.                                                           |
| `language`         | `DeepgramTextIntelligenceLanguage` | BCP-47 language hint (defaults to `'en'`).                                        |
| `callback`         | `string`                           | Webhook URL invoked after processing completes.                                   |
| `callbackMethod`   | `'POST' \| 'PUT' \| (string & {})` | HTTP method used for the callback.                                                |
| `trackState`       | `boolean`                          | Enable reactive state tracking via the `state` return value (default: `false`).   |
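
The custom-topic options compose with the boolean flags. A sketch — the `url` input shape is assumed to mirror Deepgram's Text Intelligence REST API:

```ts
import { useDeepgramTextIntelligence } from 'react-native-deepgram';

const { analyze } = useDeepgramTextIntelligence({
  options: {
    sentiment: true,
    topics: true,
    customTopic: ['pricing', 'support'],
    customTopicMode: 'extended', // keep Deepgram's own detected topics too
  },
});

// Analyze hosted text instead of an inline string (input shape assumed
// to mirror Deepgram's REST API).
await analyze({ url: 'https://example.com/transcript.txt' });
```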


Management API (useDeepgramManagement)

Returns a fully typed REST client for the Deepgram Management API. No options are required.

```ts
const dg = useDeepgramManagement();

const projects = await dg.projects.list();
console.log('Projects:', projects.map((p) => p.name));
```

Snapshot of available groups

| Group      | Representative methods                                                                     |
| ---------- | ------------------------------------------------------------------------------------------ |
| `models`   | `list(includeOutdated?)`, `get(modelId)`                                                    |
| `projects` | `list()`, `get(id)`, `delete(id)`, `patch(id, body)`, `listModels(id)`                      |
| `keys`     | `list(projectId)`, `create(projectId, body)`, `get(projectId, keyId)`, `delete(...)`        |
| `usage`    | `listRequests(projectId)`, `getRequest(projectId, requestId)`, `getBreakdown(projectId)`    |
| `balances` | `list(projectId)`, `get(projectId, balanceId)`                                              |

(Plus helpers for members, scopes, invitations, and purchases.)
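
As a sketch of how the groups compose — the `project_id` field and the key-creation body follow Deepgram's REST API conventions and are assumptions here, not verified against this package's types:

```ts
const dg = useDeepgramManagement();

// List projects, then create a scoped API key in the first one.
// Field names (project_id, comment, scopes) follow Deepgram's
// Management REST API; treat them as assumptions.
const [project] = await dg.projects.list();
const key = await dg.keys.create(project.project_id, {
  comment: 'example-app key',
  scopes: ['usage:write'],
});
```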


Example app

The repository includes an Expo-managed playground under `example/` that wires up every hook in this package.

1. Install workspace dependencies

```bash
git clone https://github.com/itsRares/react-native-deepgram
cd react-native-deepgram
yarn install
```

2. Configure your Deepgram key

Create `example/.env` containing your Deepgram API key (exposed via Expo's `EXPO_PUBLIC_` prefix) so the app can authenticate:

```bash
echo "EXPO_PUBLIC_DEEPGRAM_API_KEY=your_deepgram_key" > example/.env
```

You can generate API keys from the Deepgram Console. For management endpoints, ensure the key carries the right scopes.

3. Run or build the example

  • `yarn example` – start the Expo bundler in development mode (web preview + QR code)
  • `yarn example:ios` – compile and launch the iOS app with `expo run:ios`
  • `yarn example:android` – compile and launch the Android app with `expo run:android`

If you prefer using bare Expo commands, `cd example` and run `yarn start`, `yarn ios`, or `yarn android`.


Roadmap

  • ✅ Speech-to-Text (WebSocket + REST)
  • ✅ Speech-to-Text v2 / Flux streaming support
  • ✅ Text-to-Speech (HTTP synthesis + WebSocket streaming)
  • ✅ Text Intelligence (summaries, topics, sentiment, intents)
  • ✅ Management API wrapper
  • 🚧 Detox E2E tests for the example app

Contributing

Issues and PRs are welcome; see CONTRIBUTING.md.


License

MIT