
@khaveeai/providers-openai-realtime

v0.2.7

OpenAI Realtime API provider for Khavee AI SDK. Seamlessly integrate real-time voice conversations with VRM avatars in React applications using OpenAI's GPT-4o Realtime API.

✨ Features

  • 🎙️ Real-time Voice Chat - WebRTC-based audio streaming with OpenAI
  • 🗣️ Automatic Lip Sync - MFCC-based phoneme detection works automatically with VRMAvatar
  • 💬 Talking Animations - Auto-plays gesture animations during AI speech
  • ⚛️ React Hooks - useRealtime() hook for easy integration
  • 🛠️ Function Calling - Full support for OpenAI tools (RAG, custom functions)
  • 📝 Live Transcription - Real-time speech-to-text with conversation history
  • 🎛️ Status Management - Track connection, listening, thinking, and speaking states
  • 🎯 Zero Backend - Direct WebRTC connection to OpenAI (no proxy needed)

📦 Installation

npm install @khaveeai/providers-openai-realtime @khaveeai/react @khaveeai/core

🚀 Quick Start with React + VRM

Here's how to create a complete VRM avatar with voice chat in just a few lines:

"use client";
import { KhaveeProvider, VRMAvatar, useRealtime } from "@khaveeai/react";
import { OpenAIRealtimeProvider } from "@khaveeai/providers-openai-realtime";
import { Canvas } from "@react-three/fiber";
import { Environment } from "@react-three/drei";

// 1. Create the provider (module scope here; inside a component, wrap it in useMemo)
const realtime = new OpenAIRealtimeProvider({
  apiKey: process.env.NEXT_PUBLIC_OPENAI_API_KEY || "",
  instructions: "You are a helpful AI assistant.",
  voice: "coral"
});

// 2. Chat component using useRealtime hook
function Chat() {
  const { 
    sendMessage, 
    conversation, 
    chatStatus, 
    isConnected,
    connect,
    disconnect
  } = useRealtime();

  return (
    <div>
      {!isConnected ? (
        <button onClick={connect}>Connect to AI</button>
      ) : (
        <div>
          <div>Status: {chatStatus}</div>
          {conversation.map((msg, i) => (
            <div key={i}>{msg.role}: {msg.text}</div>
          ))}
          <button onClick={() => sendMessage("Hello!")}>Say Hello</button>
          <button onClick={disconnect}>Disconnect</button>
        </div>
      )}
    </div>
  );
}

// 3. Main app with VRM avatar
export default function App() {
  return (
    <KhaveeProvider config={{ realtime }}>
      {/* 3D VRM Avatar with automatic lip sync */}
      <Canvas>
        <VRMAvatar
          src="./models/avatar.vrm"
          position-y={-1.25}
        />
        <Environment preset="sunset" />
        <ambientLight intensity={0.5} />
      </Canvas>
      
      {/* Chat UI */}
      <Chat />
    </KhaveeProvider>
  );
}

That's it! Your VRM avatar will automatically:

  • 👄 Lip sync with the AI's voice using MFCC phoneme detection
  • 💬 Play talking/gesture animations during speech (if provided)
  • 👁️ Blink naturally for lifelike appearance

🎭 VRM Avatar Integration

Basic Setup

import { KhaveeProvider, VRMAvatar } from "@khaveeai/react";
import { OpenAIRealtimeProvider } from "@khaveeai/providers-openai-realtime";

const provider = new OpenAIRealtimeProvider({
  apiKey: "your-openai-api-key",
  voice: "coral", // Choose from: coral, shimmer, alloy, nova, echo, sage
  instructions: "Your AI personality instructions"
});

function App() {
  return (
    <KhaveeProvider config={{ realtime: provider }}>
      <Canvas>
        <VRMAvatar 
          src="./models/your-avatar.vrm"
        />
      </Canvas>
    </KhaveeProvider>
  );
}

Configuration

RealtimeConfig

interface RealtimeConfig {
  apiKey: string;                    // OpenAI API key
  model?: string;                    // Model to use (default: 'gpt-4o-realtime-preview')
  voice?: string;                    // Voice to use (default: 'shimmer')
  instructions?: string;             // System instructions
  temperature?: number;              // Response creativity (0-1)
  speed?: number;                   // Speech speed (0.25-4.0)
  language?: string;                // Language code (default: 'en')
  tools?: RealtimeTool[];           // Available functions/tools
  turnServers?: RTCIceServer[];     // Custom TURN servers
}
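The documented defaults and numeric ranges above can be enforced before constructing the provider. A minimal sketch, assuming only the fields and bounds listed in `RealtimeConfig`; `normalizeRealtimeConfig` is a hypothetical helper, not part of the SDK:

```javascript
// Hypothetical helper: fill documented defaults and clamp numeric
// fields to the ranges given in RealtimeConfig.
function normalizeRealtimeConfig(config) {
  const clamp = (value, min, max) => Math.min(max, Math.max(min, value));
  return {
    model: "gpt-4o-realtime-preview", // documented default
    voice: "shimmer",                 // documented default
    language: "en",                   // documented default
    ...config,
    // temperature is documented as 0-1, speed as 0.25-4.0
    temperature:
      config.temperature !== undefined ? clamp(config.temperature, 0, 1) : undefined,
    speed: config.speed !== undefined ? clamp(config.speed, 0.25, 4.0) : undefined,
  };
}
```

Passing the normalized object to `new OpenAIRealtimeProvider(...)` then guarantees the provider never sees an out-of-range value.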

Available Voices

  • coral - Warm, friendly voice (recommended)
  • shimmer - Clear, professional voice
  • alloy - Balanced, versatile voice
  • nova - Energetic, youthful voice
  • echo - Deep, resonant voice
  • sage - Wise, calm voice

⚛️ React Hook API

The useRealtime() hook provides everything you need for voice chat:

import { useRealtime } from "@khaveeai/react";

function ChatComponent() {
  const {
    // Connection
    isConnected,
    connect,
    disconnect,
    
    // Messaging
    sendMessage,
    conversation,
    chatStatus,
    
    // Lip sync (automatic with VRM)
    currentPhoneme,
    startAutoLipSync,
    stopAutoLipSync
  } = useRealtime();

  return (
    <div>
      {/* Connection Status */}
      <div className={`status ${isConnected ? 'connected' : 'disconnected'}`}>
        {chatStatus}
        {currentPhoneme && (
          <span>
            [{currentPhoneme.phoneme}] {(currentPhoneme.intensity * 100).toFixed(0)}%
          </span>
        )}
      </div>

      {/* Connection Controls */}
      {!isConnected ? (
        <button onClick={connect}>Connect to AI</button>
      ) : (
        <div>
          <button onClick={disconnect}>Disconnect</button>
          <button onClick={startAutoLipSync}>Restart Lip Sync</button>
          <button onClick={stopAutoLipSync}>Stop Lip Sync</button>
        </div>
      )}

      {/* Conversation */}
      <div className="messages">
        {conversation.map((msg, index) => (
          <div key={index} className={`message ${msg.role}`}>
            <strong>{msg.role}:</strong> {msg.text}
          </div>
        ))}
      </div>

      {/* Send Message */}
      <input
        type="text"
        onKeyDown={(e) => {
          // onKeyPress is deprecated in React; onKeyDown handles Enter reliably
          if (e.key === 'Enter' && e.target.value.trim()) {
            sendMessage(e.target.value);
            e.target.value = '';
          }
        }}
        disabled={!isConnected || chatStatus === "thinking"}
        placeholder="Type a message or just talk..."
      />
    </div>
  );
}

Hook Return Values

| Property | Type | Description |
|----------|------|-------------|
| `isConnected` | `boolean` | Connection status to OpenAI |
| `chatStatus` | `string` | Current status: `'stopped'`, `'starting'`, `'ready'`, `'listening'`, `'thinking'`, `'speaking'` |
| `conversation` | `Array` | Full conversation history |
| `currentPhoneme` | `Object` | Current phoneme for lip sync: `{ phoneme: string, intensity: number }` |
| `connect()` | `Function` | Connect to the OpenAI Realtime API |
| `disconnect()` | `Function` | Disconnect from the API |
| `sendMessage(text)` | `Function` | Send a text message to the AI |
| `startAutoLipSync()` | `Function` | Manually restart lip sync |
| `stopAutoLipSync()` | `Function` | Stop lip sync |

⚙️ Configuration

Provider Configuration

const realtime = new OpenAIRealtimeProvider({
  apiKey: process.env.OPENAI_API_KEY || "",
  
  // Voice & Model
  voice: "coral",                           // coral, shimmer, alloy, nova, echo, sage
  model: "gpt-4o-realtime-preview-2025-06-03",
  
  // AI Behavior
  instructions: "You are a helpful AI assistant.",
  temperature: 0.8,                         // 0-1, creativity level
  speed: 1.4,                              // 0.25-4.0, speech speed
  
  // Language & Tools
  language: "en",                          // Language code
  tools: [],                               // Function calling tools
});

Environment Variables

# .env.local
OPENAI_API_KEY=your_openai_api_key_here

Available Voices

| Voice | Description |
|-------|-------------|
| `coral` | Warm, friendly voice (recommended) |
| `shimmer` | Clear, professional voice |
| `alloy` | Balanced, versatile voice |
| `nova` | Energetic, youthful voice |
| `echo` | Deep, resonant voice |
| `sage` | Wise, calm voice |

🛠️ Function Calling

Add custom functions that the AI can call during conversation:

// Define a weather tool
const weatherTool = {
  name: 'get_weather',
  description: 'Get current weather for a location',
  parameters: {
    location: {
      type: 'string',
      description: 'City name'
    }
  },
  execute: async (args) => {
    const weather = await fetchWeather(args.location);
    return {
      success: true,
      message: `The weather in ${args.location} is ${weather.description} with temperature ${weather.temp}°C`
    };
  }
};

// Add to provider
const realtime = new OpenAIRealtimeProvider({
  apiKey: process.env.OPENAI_API_KEY || "",
  tools: [weatherTool],
  instructions: "You can help with weather information and general questions."
});

// Or register after creation
realtime.registerFunction(weatherTool);

📱 Chat Status States

The chatStatus property provides real-time feedback:

| Status | Description |
|--------|-------------|
| `stopped` | Not connected or inactive |
| `starting` | Initializing connection to OpenAI |
| `ready` | Connected and ready for input |
| `listening` | Actively listening to user speech |
| `thinking` | Processing user input |
| `speaking` | AI is speaking (avatar lip syncs automatically) |
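The statuses above can be turned into user-facing text with a plain lookup. An illustrative sketch (the label strings and the `statusLabel` helper are this example's own, not part of the SDK):

```javascript
// Map the documented chatStatus values to short UI labels.
const STATUS_LABELS = {
  stopped: "Offline",
  starting: "Connecting…",
  ready: "Ready",
  listening: "Listening…",
  thinking: "Thinking…",
  speaking: "Speaking",
};

// Fall back to "Unknown" so an unexpected status never renders blank.
function statusLabel(chatStatus) {
  return STATUS_LABELS[chatStatus] ?? "Unknown";
}
```

In a component this pairs naturally with the hook: `<div>{statusLabel(chatStatus)}</div>`.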

🎯 Automatic Lip Sync

The provider automatically handles lip sync with VRM avatars:

  • Phoneme Detection: Real-time MFCC analysis of AI speech
  • Automatic Mapping: Maps phonemes to VRM mouth expressions
  • Zero Config: Works out of the box with VRMAvatar component
  • Manual Control: Use startAutoLipSync() and stopAutoLipSync() for custom control

Current Phoneme Info

const { currentPhoneme } = useRealtime();

// currentPhoneme structure (illustrative, not executable code):
// {
//   phoneme: "aa" | "ee" | "ou" | "ih" | "oh" | "sil", // current phoneme
//   intensity: 0.75                                     // intensity level (0-1)
// }
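For a custom mesh that isn't driven by `VRMAvatar`, the `currentPhoneme` value can be mapped to mouth-expression weights by hand. A minimal sketch, assuming the five mouth phonemes listed above map one-to-one to expression names and that `sil` means silence; `phonemeToExpressionWeights` is a hypothetical helper, not part of the SDK:

```javascript
// Hypothetical helper: convert a detected phoneme into per-expression
// weights. "sil" (silence) zeroes every expression; intensity (0-1)
// scales the single active one.
const MOUTH_EXPRESSIONS = ["aa", "ee", "ou", "ih", "oh"];

function phonemeToExpressionWeights(currentPhoneme) {
  const weights = Object.fromEntries(MOUTH_EXPRESSIONS.map((name) => [name, 0]));
  if (currentPhoneme && currentPhoneme.phoneme !== "sil") {
    weights[currentPhoneme.phoneme] = Math.min(1, Math.max(0, currentPhoneme.intensity));
  }
  return weights;
}
```

Applying the returned weights each frame (for example in a `useFrame` callback) yields the same open/close motion the built-in avatar performs automatically.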

🌐 Browser Support

  • ✅ Chrome 80+
  • ✅ Firefox 78+
  • ✅ Safari 14+
  • ✅ Edge 80+

Requirements:

  • WebRTC support
  • Web Audio API
  • Microphone access (HTTPS required)

🐛 Troubleshooting

Common Issues

"Connection Failed"

# Check your API key
OPENAI_API_KEY=sk-...your_key_here

# Verify you have GPT-4o Realtime API access
# Contact OpenAI support if needed

"Microphone Not Working"

  • Ensure HTTPS is enabled (required for microphone access)
  • Check browser permissions for microphone
  • Test with other voice apps first

"Avatar Not Lip Syncing"

// Try manual restart
const { startAutoLipSync } = useRealtime();
startAutoLipSync();

// Check if phonemes are detected
const { currentPhoneme } = useRealtime();
console.log(currentPhoneme); // Should show phoneme data

"No Audio Output"

  • Check browser audio settings
  • Verify speakers/headphones are working
  • Try refreshing the page

Debug Mode

Enable detailed logging:

// Log all provider messages
const realtime = new OpenAIRealtimeProvider({
  apiKey: "your-key",
  // ... other config
});

// Add message logging
realtime.onMessage = (message) => {
  console.log('OpenAI message:', message);
};

// Add error logging  
realtime.onError = (error) => {
  console.error('Provider error:', error);
};

📄 License

MIT License - see LICENSE file for details.

🤝 Support & Contributing

Contributions welcome! Please read our contributing guidelines and submit pull requests to our GitHub repository.