react-native-audio-agent

v0.2.0

Offline React Native package for voice agents

React Native Voice Agent

🎙️ Offline AI voice agent for React Native - Free to use

A React Native library that provides an AI voice assistant experience, using Whisper for speech-to-text, offline Llama models or online providers (OpenAI, Anthropic, Google) for language modeling, and the system TTS for speech synthesis.

✨ Key Features

  • 🔥 Offline + Online - Support both offline models and cloud providers (OpenAI, Anthropic, Google)
  • 💰 Flexible Costs - Free offline processing or pay-per-use online models
  • 🚀 Modern API - Builder pattern configuration, React hooks integration
  • 📱 Cross Platform - iOS and Android support with platform optimizations
  • 🎨 Customizable UI - Optional components with full styling control
  • 🧠 Smart Features - Voice Activity Detection, barge-in support, auto-mode
  • 📊 Performance Optimized - Memory management, model caching, GPU acceleration

🎬 Quick Demo

import React from 'react';
import { Text, TouchableOpacity } from 'react-native';
import { VoiceAgent, useVoiceAgent } from 'react-native-audio-agent';

// Create agent with offline model
const offlineAgent = VoiceAgent
  .create()
  .withWhisper('tiny.en') // 39MB, fast transcription
  .withLLM({
    provider: 'offline',
    model: 'llama-3.2-3b-instruct-q4_k_m.gguf', // 1.8GB
    maxTokens: 256,
    temperature: 0.7,
  })
  .withSystemPrompt('You are a helpful assistant.')
  .build();

// Or with online provider
const onlineAgent = VoiceAgent
  .create()
  .withWhisper('base.en')
  .withLLM({
    provider: 'openai', // 'openai' | 'anthropic' | 'google'
    apiKey: 'your-api-key',
    model: 'gpt-4',
    maxTokens: 256,
    temperature: 0.7,
  })
  .withSystemPrompt('You are a helpful assistant.')
  .build();

function VoiceChat() {
  const voice = useVoiceAgent(offlineAgent); // or onlineAgent

  return (
    <TouchableOpacity
      onPress={voice.isListening ? voice.stopListening : voice.startListening}
      style={{ backgroundColor: voice.isListening ? 'red' : 'blue' }}
    >
      <Text>
        {voice.isListening ? 'Stop Listening' : 'Start Voice Chat'}
      </Text>
    </TouchableOpacity>
  );
}

📦 Installation

npm install react-native-audio-agent
# or
yarn add react-native-audio-agent

# Install peer dependencies
npm install react-native-permissions react-native-tts react-native-fs llama.rn whisper.rn react-native-audio-recorder-player react-native-nitro-modules
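
Several of these peer dependencies ship native code (whisper.rn, llama.rn, etc.), so iOS builds also need their pods installed, the usual React Native step:

cd ios && pod install && cd ..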

iOS Setup

Add permissions to ios/YourApp/Info.plist:

<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access for voice chat</string>
<key>NSSpeechRecognitionUsageDescription</key>
<string>This app needs speech recognition for voice chat</string>

Android Setup

Add permissions to android/app/src/main/AndroidManifest.xml:

<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />

⚠️ Important: Testing on Real Devices

This library must be tested on real physical devices, not emulators or simulators.

The voice agent downloads and runs large AI models (Whisper and Llama) locally on the device. Emulators and simulators:

  • Lack the computational resources to run these models effectively
  • May not properly handle model downloading and storage
  • Cannot accurately simulate real-world performance and memory constraints
  • May have audio recording/playback limitations

For the best development and testing experience, always use:

  • iOS: Real iPhone/iPad devices
  • Android: Physical Android devices

The models will be downloaded automatically on first use and cached locally for subsequent sessions.
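
Since a first launch can involve a multi-gigabyte download, it's worth surfacing progress to the user. Here is a minimal sketch using the hook's downloadProgress field (its exact shape isn't documented here; a 0-1 number is assumed):

import React from 'react';
import { Text } from 'react-native';
import { useVoiceAgent } from 'react-native-audio-agent';

function ModelStatus({ agent }) {
  const voice = useVoiceAgent(agent);

  if (!voice.isInitialized) {
    // Assumes downloadProgress is a number in [0, 1]; adapt to the real shape
    const pct = Math.round((voice.downloadProgress ?? 0) * 100);
    return <Text>Downloading models… {pct}%</Text>;
  }
  return <Text>Agent ready</Text>;
}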

🚀 Tech Stack

  • STT: whisper.rn with tiny.en model (39MB, 94.3% accuracy)
  • LLM: llama.rn with Llama 3.2 3B quantized (1.8GB)
  • TTS: react-native-tts using high-quality system voices
  • State: Zustand for internal state management
  • Audio: Advanced audio session management with barge-in support
  • Permissions: react-native-permissions for microphone access

📚 API Reference

VoiceAgent Builder

const agent = VoiceAgent.create()
  .withWhisper('tiny.en') // 'tiny.en' | 'base.en' | 'small.en' | 'medium.en' | 'large-v2' | 'large-v3'
  .withLlama('model-filename.gguf')
  .withSystemPrompt('Your custom prompt')
  .withVoiceSettings({
    rate: 0.5, // Speech rate (0.1 - 1.0)
    pitch: 1.0, // Speech pitch (0.5 - 2.0)
    language: 'en-US', // Language code
  })
  .enableGPUAcceleration(true) // iOS Metal support
  .withMaxHistoryLength(10) // Conversation memory
  .enableVAD(true) // Voice Activity Detection
  .build();

useVoiceAgent Hook

const voice = useVoiceAgent(agent);

// Methods
voice.startListening(); // Start recording
voice.stopListening(); // Stop recording
voice.speak(text); // Speak text
voice.interruptSpeech(); // Stop current speech
voice.setSystemPrompt(prompt); // Update AI personality
voice.clearHistory(); // Clear conversation

// State
voice.isListening; // Currently recording
voice.isThinking; // Processing speech
voice.isSpeaking; // Playing response
voice.transcript; // Last transcribed text
voice.response; // Last AI response
voice.error; // Current error state
voice.isInitialized; // Agent ready
voice.downloadProgress; // Model download status
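
A common pattern is to gate the UI on isInitialized and surface error before exposing the talk button. This sketch is built only from the state and methods above, reusing an agent from the earlier examples:

function VoiceGate() {
  const voice = useVoiceAgent(agent);

  if (voice.error) return <Text>Error: {String(voice.error)}</Text>;
  if (!voice.isInitialized) return <Text>Loading models…</Text>;

  return (
    <TouchableOpacity onPress={voice.startListening} disabled={voice.isThinking}>
      <Text>{voice.isThinking ? 'Thinking…' : '🎤 Talk'}</Text>
    </TouchableOpacity>
  );
}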

Advanced Hook

const voice = useAdvancedVoiceAgent(agent);

// Additional features
voice.startConversation(); // Begin session
voice.endConversation(); // End session
voice.enableAutoMode(true); // Auto-continue after responses
voice.canStartRecording; // Permission check
voice.getConversationStatus(); // Detailed state info
voice.getDownloadInfo(); // Model download details
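
For hands-free sessions, auto-mode can be combined with explicit session boundaries. A sketch under the API above, again assuming an agent from the earlier examples:

function HandsFreeChat() {
  const voice = useAdvancedVoiceAgent(agent);

  const begin = () => {
    voice.enableAutoMode(true); // keep listening after each AI response
    voice.startConversation();  // open the session
  };

  return (
    <View>
      <TouchableOpacity onPress={begin} disabled={!voice.canStartRecording}>
        <Text>Start hands-free chat</Text>
      </TouchableOpacity>
      <TouchableOpacity onPress={voice.endConversation}>
        <Text>End session</Text>
      </TouchableOpacity>
    </View>
  );
}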

UI Components

// Simple button with built-in logic
<VoiceAgentButton
  agent={agent}
  onTranscript={(text) => console.log('User said:', text)}
  onResponse={(text) => console.log('AI replied:', text)}
  onError={(error) => Alert.alert('Error', error)}
/>

// Advanced component with controls
<AdvancedVoiceAgentButton
  agent={agent}
  showTranscript={true}
  showResponse={true}
  showStatus={true}
  compact={false}
/>

🎯 Usage Patterns

Basic Voice Chat

import React from 'react';
import { View } from 'react-native';
import { VoiceAgent, VoiceAgentButton } from 'react-native-audio-agent';

const agent = VoiceAgent
  .create()
  .withWhisper('tiny.en')
  .withLlama('llama-3.2-3b-instruct-q4_k_m.gguf')
  .withSystemPrompt('You are a helpful assistant.')
  .build();

export function BasicVoiceChat() {
  return (
    <View style={{ flex: 1, justifyContent: 'center', padding: 20 }}>
      <VoiceAgentButton agent={agent} />
    </View>
  );
}

Custom Controls

import React from 'react';
import { Text, TouchableOpacity, View } from 'react-native';
import { useVoiceAgent } from 'react-native-audio-agent';

// Uses the `agent` built in the Basic Voice Chat example above
export function CustomVoiceUI() {
  const voice = useVoiceAgent(agent);

  return (
    <View>
      <Text>Status: {voice.isListening ? 'Listening...' : 'Ready'}</Text>
      <Text>You: {voice.transcript}</Text>
      <Text>AI: {voice.response}</Text>

      <TouchableOpacity onPress={voice.startListening}>
        <Text>🎤 Talk</Text>
      </TouchableOpacity>

      {voice.isSpeaking && (
        <TouchableOpacity onPress={voice.interruptSpeech}>
          <Text>⏸️ Interrupt</Text>
        </TouchableOpacity>
      )}
    </View>
  );
}

Personality Switching

const personalities = {
  assistant: 'You are a helpful AI assistant.',
  pirate: 'You are a friendly pirate. Speak like one!',
  poet: 'You are a creative poet. Respond in verse.',
};

// `voice` comes from useVoiceAgent(agent) inside a component
function switchPersonality(type: keyof typeof personalities) {
  voice.setSystemPrompt(personalities[type]);
  voice.clearHistory();
}
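
Wired into a component, the switcher looks roughly like this sketch (same `agent` as the earlier examples):

export function PersonalityChat() {
  const voice = useVoiceAgent(agent);

  const switchTo = (type: keyof typeof personalities) => {
    voice.setSystemPrompt(personalities[type]); // swap persona
    voice.clearHistory(); // let the new persona start fresh
  };

  return (
    <View>
      {(Object.keys(personalities) as Array<keyof typeof personalities>).map((type) => (
        <TouchableOpacity key={type} onPress={() => switchTo(type)}>
          <Text>{type}</Text>
        </TouchableOpacity>
      ))}
    </View>
  );
}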

📱 Platform Specifics

iOS Features

  • Metal GPU acceleration for Llama inference
  • Core ML acceleration for Whisper (when available)
  • AVAudioSession optimization for low latency
  • High-quality Neural TTS voices

Android Features

  • GPU acceleration via NNAPI (when supported)
  • AudioRecord optimization for real-time processing
  • System TTS with voice selection
  • Background processing permissions

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Setup

git clone https://github.com/TheWidlarzGroup/react-native-audio-agent
cd react-native-audio-agent
yarn install

# Run example app
cd example
yarn install
yarn ios # or yarn android

🙏 Acknowledgments

  • whisper.rn for Whisper integration
  • llama.rn for Llama integration
  • OpenAI for the Whisper model
  • Meta for the Llama models

Built by The Widlarz Group 🚀

Making AI voice interfaces accessible to every React Native developer