@codmir/realtime
v0.1.1
Low-latency voice infrastructure with Hive Mode DSP - the collective consciousness voice effect.
Features
- Contract-driven AI execution - Modal.com (now) → AWS (future)
- Event streaming - SSE for Vercel Edge
- Streaming TTS - ElevenLabs, Local (Coqui, Piper)
- Hive Mode DSP - Multi-layered collective voice effect
- Low-latency transport - WebRTC, WebSocket
Architecture
┌─────────────────────────────────────────────────────────────────────┐
│                      @codmir/realtime Pipeline                      │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│    User Input                                                       │
│         │                                                           │
│         ▼                                                           │
│  ┌─────────────┐                                                    │
│  │   Runner    │ ← Modal (now) / AWS (future) / Local               │
│  └─────────────┘                                                    │
│         │ Streaming text                                            │
│         ▼                                                           │
│  ┌─────────────┐                                                    │
│  │   Chunker   │ ← Semantic text segmentation                       │
│  └─────────────┘                                                    │
│         │ Text chunks                                               │
│         ▼                                                           │
│  ┌─────────────┐                                                    │
│  │     TTS     │ ← ElevenLabs / Local                               │
│  └─────────────┘                                                    │
│         │ Audio chunks                                              │
│         ▼                                                           │
│  ┌─────────────┐                                                    │
│  │  Hive DSP   │ ← Multi-layer voice effect                         │
│  └─────────────┘                                                    │
│         │ Processed audio                                           │
│         ▼                                                           │
│  ┌─────────────┐                                                    │
│  │  Transport  │ ← WebRTC / WebSocket                               │
│  └─────────────┘                                                    │
│         │                                                           │
│         ▼                                                           │
│      Client                                                         │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
Installation
pnpm add @codmir/realtime
Quick Start
import {
  createRealtimeSession,
  ModalRunner,
  ElevenLabsTTS,
} from '@codmir/realtime';
// Create session
const session = createRealtimeSession({
  runner: new ModalRunner({
    apiKey: process.env.MODAL_API_KEY,
  }),
  tts: new ElevenLabsTTS({
    apiKey: process.env.ELEVENLABS_API_KEY,
  }),
  voiceMode: 'hive',
});
// Start session
await session.start();
// Execute contract with voice output
await session.execute({
  id: 'task_1',
  version: '1.0',
  task: 'Explain quantum computing',
});
// Or speak directly
await session.speak('Welcome to the Council');
Hive Mode
Create the "collective consciousness" voice effect used in films:
import { HiveModeProcessor, HIVE_PRESETS } from '@codmir/realtime';
const processor = new HiveModeProcessor();
// Apply preset
processor.applyPreset('collective'); // subtle | collective | terrifying | ancient
// Or configure manually instead of applying a preset
// (named customProcessor so it does not redeclare `processor` above)
const customProcessor = new HiveModeProcessor({
  layers: 5,
  offsets: [0, 12, 25, 38, 50], // ms delays
  pitchShifts: [0, -3, 2, -5, 4], // semitones
  volumes: [1, 0.5, 0.5, 0.3, 0.3],
  reverb: 0.4,
  chorus: 0.3,
  intensity: 0.6,
});
// Process audio
const processed = processor.process(audioBuffer);
Hive Presets
| Preset | Description | Use Case |
|--------|-------------|----------|
| subtle | Light layering | Subtle otherworldly feel |
| collective | Multiple voices | AI council, collective |
| terrifying | Heavy effect | Dramatic moments |
| ancient | Deep, reverberant | Ancient entities |
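The `offsets` and `volumes` fields in the manual configuration suggest that each layer is a delayed, attenuated copy of the voice. The sketch below illustrates only that idea and is not the package's implementation. `HiveLayer`, `mixLayers`, and `semitonesToRatio` are hypothetical names; pitch shifting, reverb, and chorus are not modeled.

```typescript
// Hypothetical layer description, loosely mirroring the manual config fields.
interface HiveLayer {
  offsetMs: number; // delay applied to this copy of the voice
  volume: number;   // linear gain applied to this copy
}

// Convert a pitch shift in semitones to a frequency ratio (equal temperament).
function semitonesToRatio(semitones: number): number {
  return Math.pow(2, semitones / 12);
}

// Mix delayed, scaled copies of a mono signal into one output buffer.
// The output is longer than the input by the largest delay.
function mixLayers(
  input: Float32Array,
  sampleRate: number,
  layers: HiveLayer[],
): Float32Array {
  const maxOffsetMs = layers.reduce((m, l) => Math.max(m, l.offsetMs), 0);
  const tail = Math.round((maxOffsetMs / 1000) * sampleRate);
  const out = new Float32Array(input.length + tail);
  for (const layer of layers) {
    const delay = Math.round((layer.offsetMs / 1000) * sampleRate);
    for (let i = 0; i < input.length; i++) {
      out[i + delay] += input[i] * layer.volume;
    }
  }
  return out;
}
```

The five example volumes sum to 2.6, so a real mixer would likely normalize or limit the result to avoid clipping.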
Browser (Web Audio API)
import { HiveModeWebAudio } from '@codmir/realtime';
const audioContext = new AudioContext();
const hive = new HiveModeWebAudio(audioContext);
// Connect to audio source (e.g., from TTS)
const source = audioContext.createBufferSource();
source.connect(hive.getInput());
hive.connectToDestination();
// Adjust in real-time
hive.setIntensity(0.8);
hive.applyPreset('terrifying');
Contract Runners
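The Modal example in this section calls `start` and `stream`, and the migration notes state that runners share one interface. As a rough illustration of what such a shared shape could look like, here is a sketch with an in-memory stand-in. `Contract`, `RunEvent`, `ContractRunner`, `EchoRunner`, and the `run.completed` event name are all assumptions for illustration, not types exported by the package.

```typescript
// Hypothetical shapes; the real package types may differ.
interface Contract {
  id: string;
  version: string;
  task: string;
}

interface RunEvent {
  type: string;
  data: unknown;
}

interface ContractRunner {
  start(contract: Contract): Promise<{ id: string }>;
  stream(runId: string): AsyncIterable<RunEvent>;
}

// In-memory stand-in that echoes the task one word at a time.
class EchoRunner implements ContractRunner {
  private runs = new Map<string, Contract>();

  async start(contract: Contract): Promise<{ id: string }> {
    const id = `run_${contract.id}`;
    this.runs.set(id, contract);
    return { id };
  }

  async *stream(runId: string): AsyncIterable<RunEvent> {
    const contract = this.runs.get(runId);
    if (!contract) throw new Error(`unknown run: ${runId}`);
    for (const word of contract.task.split(' ')) {
      yield { type: 'step.output.delta', data: { delta: word } };
    }
    yield { type: 'run.completed', data: {} }; // hypothetical terminal event
  }
}

// Client code depends only on the interface, so runners can be swapped.
async function collectEventTypes(
  runner: ContractRunner,
  contract: Contract,
): Promise<string[]> {
  const run = await runner.start(contract);
  const types: string[] = [];
  for await (const event of runner.stream(run.id)) {
    types.push(event.type);
  }
  return types;
}
```

Because `collectEventTypes` accepts any `ContractRunner`, replacing the stand-in with another implementation would not require changes at the call site.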
Modal Runner (Serverless)
import { ModalRunner } from '@codmir/realtime';
const runner = new ModalRunner({
  apiKey: process.env.MODAL_API_KEY,
  appName: 'codmir-realtime',
  functionName: 'execute_contract',
  gpu: 'T4', // Optional GPU
  scaledownWindow: 300, // Keep warm for 5 min
});
const run = await runner.start({
  id: 'contract_1',
  version: '1.0',
  task: 'Generate a poem about AI',
});
// Stream events
for await (const event of runner.stream(run.id)) {
  console.log(event.type, event.data);
}
Local Runner (Privacy)
import { LocalRunner } from '@codmir/realtime';
const runner = new LocalRunner({
  provider: 'ollama',
  model: 'llama3.2',
  localUrl: 'http://localhost:11434',
});
Event Streaming (SSE)
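Server-Sent Events travel as plain text frames: an optional `event:` line, one or more `data:` lines, and a blank line as terminator. The following sketch shows how an event such as `step.output.delta` could be framed and parsed. `StreamEvent`, `encodeSSE`, and `decodeSSE` are hypothetical helpers, and the JSON payload encoding is an assumption rather than the package's documented wire format.

```typescript
// Event shape assumed from the event names used in this README.
interface StreamEvent {
  type: string;
  data: unknown;
}

// Encode one event as an SSE frame: "event:" line, "data:" line, blank line.
function encodeSSE(event: StreamEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event.data)}\n\n`;
}

// Decode a string of complete SSE frames back into events.
function decodeSSE(buffer: string): StreamEvent[] {
  const events: StreamEvent[] = [];
  for (const frame of buffer.split('\n\n')) {
    if (frame.trim() === '') continue;
    let type = 'message'; // default event type when no "event:" line is present
    const dataLines: string[] = [];
    for (const line of frame.split('\n')) {
      if (line.startsWith('event:')) {
        type = line.slice('event:'.length).trim();
      } else if (line.startsWith('data:')) {
        dataLines.push(line.slice('data:'.length).trimStart());
      }
    }
    if (dataLines.length > 0) {
      events.push({ type, data: JSON.parse(dataLines.join('\n')) });
    }
  }
  return events;
}
```

A real client also has to buffer partial frames that arrive split across network reads; this sketch assumes complete frames.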
Server (Vercel/Next.js)
// app/api/realtime/stream/route.ts
import { createSSEResponse, ModalRunner } from '@codmir/realtime';
export async function POST(req: Request) {
  const { contract } = await req.json();
  const runner = new ModalRunner({ apiKey: process.env.MODAL_API_KEY });
  const run = await runner.start(contract);
  return createSSEResponse(runner.stream(run.id));
}
Client
import { SSEClient } from '@codmir/realtime';
const client = new SSEClient({
  url: '/api/realtime/stream?runId=run_123',
});
client.connect();
client.on('step.output.delta', (event) => {
  console.log('Token:', event.data.delta);
});
client.on('voice.chunk', (event) => {
  // Play audio chunk
});
TTS Providers
ElevenLabs (Streaming)
import { ElevenLabsTTS } from '@codmir/realtime';
const tts = new ElevenLabsTTS({
  apiKey: process.env.ELEVENLABS_API_KEY,
  modelId: 'eleven_turbo_v2_5',
  optimizeStreamingLatency: 3,
});
// Stream synthesis
for await (const chunk of tts.streamSynthesize('Hello world')) {
  // Process audio chunk
}
Local TTS (Coqui/Piper)
import { LocalTTS } from '@codmir/realtime';
const tts = new LocalTTS({
  provider: 'coqui',
  localUrl: 'http://localhost:5002',
});
Text Chunking
Optimize latency with smart text segmentation:
import { TextChunker } from '@codmir/realtime';
const chunker = new TextChunker({
  strategy: 'adaptive', // sentence | clause | word | time | adaptive
  minChars: 20,
  maxChars: 200,
  maxTimeMs: 500,
});
// Add streaming text
for (const token of tokens) {
  const chunks = chunker.add(token);
  for (const chunk of chunks) {
    await tts.synthesize(chunk.text);
  }
}
// Flush remaining
const final = chunker.flush();
Voice Modes
import { VoiceModeProcessor } from '@codmir/realtime';
const processor = new VoiceModeProcessor('hive');
// Available modes:
// - normal : Single voice
// - hive : Multi-layered collective
// - whisper : Quiet, intimate
// - entity : Deep, reverberant
// - oracle : Ethereal, wise
// - swarm : Many distinct voices
processor.setMode('entity');
processor.setIntensity(0.8);
Transport
WebRTC (Lowest Latency)
import { WebRTCTransport } from '@codmir/realtime';
const transport = new WebRTCTransport({
  signalingUrl: '/api/rtc/signal',
  iceServers: [{ urls: 'stun:stun.l.google.com:19302' }],
});
await transport.connect();
transport.onAudio((frame) => {
  // Play received audio
});
WebSocket (Fallback)
import { WebSocketTransport } from '@codmir/realtime';
const transport = new WebSocketTransport({
  url: 'wss://api.example.com/realtime',
  autoReconnect: true,
});
Latency Optimization
- Aggressive chunking - Start speaking before sentence completes
- Streaming TTS - Use ElevenLabs WebSocket API
- Client-side DSP - Do hive processing in browser
- WebRTC transport - Native low-latency audio
- Warm containers - Keep Modal containers warm
Target latency:
- Time-to-first-audio: 300-700ms
- Conversational feel: <1s round-trip
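As an illustration of the aggressive chunking item above, here is a minimal sketch of a chunker that emits text at the first sentence or clause boundary past a minimum length, and forces a cut at a word boundary once a maximum length is reached. It is not the package's `TextChunker`: `SimpleChunker` and its internals are hypothetical, and only the `minChars` and `maxChars` option names are taken from this README.

```typescript
interface ChunkerOptions {
  minChars: number; // do not cut before this many characters
  maxChars: number; // force a cut once the buffer reaches this length
}

class SimpleChunker {
  private buffer = '';
  private readonly opts: ChunkerOptions;

  constructor(opts: ChunkerOptions) {
    this.opts = opts;
  }

  // Append streamed text and return any chunks that are ready to speak.
  add(token: string): string[] {
    this.buffer += token;
    const chunks: string[] = [];
    let cut = this.findCut();
    while (cut > 0) {
      const text = this.buffer.slice(0, cut).trim();
      if (text.length > 0) chunks.push(text);
      this.buffer = this.buffer.slice(cut);
      cut = this.findCut();
    }
    return chunks;
  }

  // Return whatever text is left once the stream ends.
  flush(): string[] {
    const rest = this.buffer.trim();
    this.buffer = '';
    return rest.length > 0 ? [rest] : [];
  }

  // Index at which to cut the buffer, or -1 if more text is needed.
  private findCut(): number {
    const { minChars, maxChars } = this.opts;
    const limit = Math.min(this.buffer.length, maxChars);
    // Prefer the first punctuation mark at or past minChars.
    for (let i = Math.max(minChars - 1, 0); i < limit; i++) {
      if ('.!?,;:'.includes(this.buffer.charAt(i))) return i + 1;
    }
    // Otherwise force a cut at the last space before maxChars.
    if (this.buffer.length >= maxChars) {
      const space = this.buffer.lastIndexOf(' ', maxChars);
      return space > 0 ? space : maxChars;
    }
    return -1;
  }
}

const demoChunker = new SimpleChunker({ minChars: 10, maxChars: 40 });
const readyChunks = demoChunker.add('Hello, world. How are'); // ["Hello, world."]
const leftover = demoChunker.flush(); // ["How are"]
```

Lower `minChars` values start speech sooner at the cost of choppier prosody. Treating every period as a boundary is naive (it would split "3.14"), so a production chunker needs smarter boundary detection.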
Multi-AI Conference (7 AI Assistants)
Create real-time voice conferences with multiple AI participants using PersonaPlex:
import {
  ConferenceRunner,
  createCouncilConference,
  DEFAULT_COUNCIL,
} from '@codmir/realtime';
// Quick start: Council of 7 AI experts
const session = await createCouncilConference(
  'wss://personaplex.example.com:8998',
  'The future of artificial intelligence'
);
// Or create custom conference
const runner = new ConferenceRunner({
  serverUrl: 'wss://personaplex.example.com:8998',
  turnMode: 'free', // free | round_robin | moderated | priority | reactive
  mixingMode: 'stereo', // stereo | mono | separate
  onParticipantAudio: (id, opusData) => playAudio(id, opusData),
  onParticipantText: (id, text) => updateTranscript(id, text),
});
await runner.startConference({
  participants: [
    {
      id: 'sage',
      name: 'Sage',
      voice: 'NATF0',
      persona: 'A wise philosopher who considers multiple perspectives',
      pan: -0.5,
      role: 'expert',
    },
    {
      id: 'analyst',
      name: 'Analyst',
      voice: 'NATM1',
      persona: 'A data scientist focused on facts and evidence',
      pan: 0.5,
      role: 'expert',
    },
    // Up to 7 participants
  ],
  topic: 'Climate change solutions',
});
// User speaks - audio goes to all participants
runner.sendAudio(userOpusFrame);
// Control turns
runner.requestSpeaker('sage');
runner.muteParticipant('analyst');
// Get transcripts
const transcripts = runner.getTranscripts();
Default Council of 7
Pre-configured diverse AI personas:
| ID | Name | Voice | Role | Personality |
|----|------|-------|------|-------------|
| sage | Sage | NATF0 | Expert | Wise philosopher |
| nova | Nova | NATF2 | Participant | Energetic innovator |
| atlas | Atlas | NATM0 | Expert | Pragmatic analyst |
| oracle | Oracle | VARF1 | Host | Mysterious seer |
| ember | Ember | NATF3 | Participant | Ethics advocate |
| cipher | Cipher | NATM2 | Expert | Technical specialist |
| echo | Echo | VARM1 | Participant | Consensus builder |
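Each participant in the conference configuration carries a `pan` value, and `mixingMode: 'stereo'` implies the voices are spread across the left and right channels. The README does not say which pan law is used; the sketch below assumes an equal-power law, a common choice, with `pan` ranging from -1 (left) to 1 (right). `panGains` and `mixStereo` are hypothetical helpers.

```typescript
// Equal-power pan law: pan in [-1, 1] maps to an angle in [0, π/2],
// so left² + right² stays constant across the stereo field.
function panGains(pan: number): { left: number; right: number } {
  const clamped = Math.max(-1, Math.min(1, pan));
  const angle = ((clamped + 1) * Math.PI) / 4;
  return { left: Math.cos(angle), right: Math.sin(angle) };
}

// Mix mono participant buffers into interleaved stereo [L0, R0, L1, R1, ...].
function mixStereo(
  sources: { samples: Float32Array; pan: number }[],
): Float32Array {
  const frames = sources.reduce((m, s) => Math.max(m, s.samples.length), 0);
  const out = new Float32Array(frames * 2);
  for (const { samples, pan } of sources) {
    const { left, right } = panGains(pan);
    for (let i = 0; i < samples.length; i++) {
      out[2 * i] += samples[i] * left;
      out[2 * i + 1] += samples[i] * right;
    }
  }
  return out;
}
```

Under this assumption, a participant at `pan: -0.5` is louder in the left channel but still audible in the right, which keeps seven voices distinguishable without hard-panning any of them.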
React Hook
import { useConference, useCouncilConference } from '@codmir/realtime';
function ConferenceRoom() {
  const {
    session,
    state,
    participants,
    speakers,
    start,
    stop,
    sendAudio,
    transcripts,
    isActive,
  } = useCouncilConference('wss://personaplex.example.com:8998');
  return (
    <div>
      <button onClick={() => start({ topic: 'AI Ethics' })}>
        Start Conference
      </button>
      {participants.map(p => (
        <div key={p.config.id} className={speakers.includes(p.config.id) ? 'speaking' : ''}>
          {p.config.name}: {transcripts.get(p.config.id)}
        </div>
      ))}
    </div>
  );
}
Turn Modes
| Mode | Description |
|------|-------------|
| free | All participants can speak anytime |
| round_robin | Cycle through participants in order |
| moderated | Host controls who speaks |
| priority | Higher priority participants speak first |
| reactive | Respond based on what was said |
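To make two of these modes concrete, the sketch below shows one plausible way to pick the next speaker under `round_robin` and `priority`. The `TurnParticipant` shape, its `priority` and `muted` fields, and both function names are assumptions for illustration; the package's own scheduling logic is not documented here.

```typescript
// Hypothetical participant state used only for turn selection.
interface TurnParticipant {
  id: string;
  priority: number; // higher value speaks first in priority mode
  muted: boolean;
}

// round_robin: the next unmuted participant after the current speaker.
function nextRoundRobin(
  participants: TurnParticipant[],
  currentId: string | null,
): string | null {
  const n = participants.length;
  const start =
    currentId === null ? -1 : participants.findIndex((p) => p.id === currentId);
  for (let step = 1; step <= n; step++) {
    const candidate = participants[(start + step + n) % n];
    if (candidate && !candidate.muted) return candidate.id;
  }
  return null; // nobody is able to speak
}

// priority: among participants requesting the floor, the highest priority wins.
function pickByPriority(requests: TurnParticipant[]): string | null {
  let bestId: string | null = null;
  let bestPriority = -Infinity;
  for (const p of requests) {
    if (!p.muted && p.priority > bestPriority) {
      bestId = p.id;
      bestPriority = p.priority;
    }
  }
  return bestId;
}
```

In this sketch muted participants are skipped in both modes, which matches the intent of calling `muteParticipant` before requesting the next speaker.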
Future: AWS Migration
// Phase 1: Modal Runner (now)
const modalRunner = new ModalRunner({ ... });
// Phase 2: AWS Runner (future)
import { AWSRunner } from '@codmir/realtime';
const awsRunner = new AWSRunner({
  region: 'us-east-1',
  functionArn: 'arn:aws:lambda:...',
  provisionedConcurrency: 5, // Warm containers
});
// Same interface - swap without client changes
License
MIT
