@ashwin_droid/airecording
v1.3.1
Recording with CF DOs and R2
AIRecorder
Real-time audio recording library that streams audio to Cloudflare R2 storage.
Overview
AIRecorder captures audio from the browser microphone in real time and streams it to Cloudflare R2 over a WebSocket connection. It produces two output streams: raw PCM audio for AI processing and compressed WebM/Opus for efficient storage.
Features
- Real-time Audio Streaming - Capture and stream audio with low latency
- Dual-Stream Architecture - Raw PCM (24kHz) for AI + compressed WebM/Opus for storage
- Network Resilience - Automatic reconnection with 30-minute grace period and buffering
- Pause & Resume - Pause recording for up to about 5 hours, or pause the whole session and resume within 30 minutes using a fresh WebSocket URL
- Cloudflare Integration - Uses Durable Objects for session management and R2 for storage
- FFT Analysis - Built-in frequency analysis for audio visualization
- Backpressure Handling - Buffers data during network interruptions
Installation
npm install @ashwin_droid/airecording
Quick Start
Browser Client
import { createRealtimeRecorderInstance, resumeRealtimeRecorderInstance } from '@ashwin_droid/airecording/client';
// Create a recording session via your worker API
const { wsUrl, sessionId } = await fetch('/api/create-recording-session')
  .then(r => r.json());
// Start recording
const recorder = await createRealtimeRecorderInstance(wsUrl, {
  chunkMs: 250,            // PCM chunk interval (default: 250ms)
  compressedChunkMs: 5000, // Compressed chunk interval (default: 5000ms)
  includeVideo: false      // Add dummy video track (optional)
});
// Listen for network events
recorder.onNetworkError((err) => console.log('Network lost:', err));
recorder.onNetworkRestored(() => console.log('Network restored'));
// Get FFT data for visualization
const freqData = recorder.fft({ buckets: 50 });
// Stop recording
await recorder.finalise();
Pause / Resume
// Pause just the recording (socket stays up); auto times out after ~5h
await recorder.pauseRecording();
// Resume recording (reattaches AI if it was enabled)
await recorder.resumeRecording();
// Pause the whole session (intentionally drop WS; 30m resume window)
await recorder.pauseSession();
// Later, on any device, mint a fresh wsUrl from your backend resume endpoint
const { wsUrl } = await fetch('/api/resume-recording-session?sid=...').then(r => r.json());
const { controller, state } = await resumeRealtimeRecorderInstance(wsUrl);
// state includes recordedMs, bytesWritten, fileKey, timestamps, flags
Cloudflare Worker
import {
  createRealtimeRecordingSession,
  handleRealtimeRecordingWebSocket,
  RecordingSession
} from '@ashwin_droid/airecording/worker';

export default {
  async fetch(request, env) {
    const url = new URL(request.url);

    // Session initialization
    if (url.pathname === '/api/create-recording-session') {
      const { wsUrl, sessionId, expectedFileLocation } = await createRealtimeRecordingSession(
        'RECORDINGS',           // R2 binding name
        env.RECORDING_SESSIONS, // Durable Object namespace
        request.url,
        {
          basePrefix: 'sessions',
          metadata: { userId: 'user123' },
          format: 'webm' // 'webm' or 'mp3'
        }
      );
      // Return the session details as JSON so the client can call r.json()
      return new Response(JSON.stringify({ wsUrl, sessionId, expectedFileLocation }), {
        headers: { 'Content-Type': 'application/json' }
      });
    }

    // WebSocket upgrade
    if (url.pathname === '/realtime-recording') {
      return handleRealtimeRecordingWebSocket(env.RECORDING_SESSIONS, request);
    }

    return new Response('Not Found', { status: 404 });
  }
};

// The Durable Object class must be exported from the Worker entry point
export { RecordingSession };
Wrangler Configuration
[[durable_objects.bindings]]
name = "RECORDING_SESSIONS"
class_name = "RecordingSession"
[[r2_buckets]]
binding = "RECORDINGS"
bucket_name = "my-recordings-bucket"
Project Structure
├── index.js                  # Main entry point
├── package.json
├── client/
│   ├── index.js              # Client exports
│   └── realtime-recorder.js  # Browser recorder implementation
└── worker/
    ├── index.js              # Worker exports
    ├── api.js                # API handlers
    └── recording-session.js  # Durable Object session manager
Module Exports
Import specific submodules as needed:
// Full library
import { ... } from '@ashwin_droid/airecording';
// Client only
import {
createRealtimeRecorderInstance,
resumeRealtimeRecorderInstance,
} from '@ashwin_droid/airecording/client';
// Worker only
import { createRealtimeRecordingSession, handleRealtimeRecordingWebSocket } from '@ashwin_droid/airecording/worker';
import { RecordingSession } from '@ashwin_droid/airecording/worker/recording-session';
Architecture
Dual-Stream Processing
| Stream | Format | Sample Rate | Use Case |
|--------|--------|-------------|----------|
| PCM | Base64 JSON | 24kHz | AI processing, speech-to-text |
| Compressed | WebM/Opus | Native | Storage in R2 |
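To make the PCM row concrete, here is a minimal sketch of how one chunk of microphone samples could be packed into a Base64 JSON message. The 16-bit sample width and the `{ type, sampleRate, data }` message shape are assumptions for illustration only; the README does not document the wire format, and the function names are hypothetical.

```javascript
// Assumptions (not confirmed by the README): 16-bit samples in platform byte
// order and a { type, sampleRate, data } message shape. Names are hypothetical.
const SAMPLE_RATE_HZ = 24000; // PCM rate stated in the table above

// Number of samples in one chunk, e.g. 250 ms at 24 kHz.
function samplesPerChunk(chunkMs, sampleRate = SAMPLE_RATE_HZ) {
  return Math.round((sampleRate * chunkMs) / 1000);
}

// Convert Float32 samples in [-1, 1] to 16-bit signed PCM, clamping overshoot.
function floatTo16BitPcm(float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    pcm[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return pcm;
}

// Base64-encode raw bytes in slices so String.fromCharCode never receives
// an oversized argument list.
function bytesToBase64(bytes) {
  let binary = '';
  const SLICE = 0x8000;
  for (let i = 0; i < bytes.length; i += SLICE) {
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + SLICE));
  }
  return btoa(binary);
}

// Build the JSON text for one PCM chunk (hypothetical message shape).
function encodePcmChunk(float32Samples) {
  const pcm = floatTo16BitPcm(float32Samples);
  const bytes = new Uint8Array(pcm.buffer);
  return JSON.stringify({
    type: 'pcm',
    sampleRate: SAMPLE_RATE_HZ,
    data: bytesToBase64(bytes)
  });
}
```

With the default `chunkMs` of 250, `samplesPerChunk(250)` gives 6000 samples per message, which at two bytes per sample is 12000 bytes before Base64 expansion.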
Network Resilience
- Automatic reconnection on network loss
- 30-minute grace period for incomplete sessions
- Backlog queue buffers data during disconnection
- Automatic flush on network restoration
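The buffering behaviour listed above can be pictured with a small sketch. This is not the library's implementation; `BacklogQueue` and its method names are hypothetical, and the injected `send` function stands in for whatever actually writes to the WebSocket.

```javascript
// Hypothetical sketch of a backlog queue: messages are sent immediately while
// connected, held in order while disconnected, and flushed on restoration.
class BacklogQueue {
  constructor(send) {
    this.send = send;      // function that transmits a single message
    this.connected = true; // assume the connection starts healthy
    this.backlog = [];     // messages waiting for the network to return
  }

  // Send now if possible, otherwise keep the message for later.
  push(message) {
    if (this.connected) {
      this.send(message);
    } else {
      this.backlog.push(message);
    }
  }

  // Call when the connection drops.
  onNetworkLost() {
    this.connected = false;
  }

  // Call when the connection is back: flush in original order.
  onNetworkRestored() {
    this.connected = true;
    while (this.backlog.length > 0) {
      this.send(this.backlog.shift());
    }
  }
}
```

A production queue would also need an upper bound on memory, since a 30-minute outage can accumulate a large backlog; that policy is left out here.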
R2 Upload Strategy
- Multipart upload for large files (parts larger than 5 MB)
- Lazy initialization to avoid empty uploads
- Session state persisted in Durable Object storage
- Automatic cleanup on session expiration
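As a rough illustration of the first two points, the sketch below accumulates incoming bytes into fixed-size parts and only creates the multipart upload once the first part is ready, so a session that never produces data never creates an upload. It uses the R2 binding methods `createMultipartUpload`, `uploadPart`, and `complete`; the `LazyMultipartWriter` class, its part size parameter, and the overall structure are assumptions, not the library's code.

```javascript
const DEFAULT_PART_BYTES = 5 * 1024 * 1024; // 5 MiB per part (assumed size)

// Hypothetical writer: buffers bytes, uploads fixed-size parts, and defers
// creating the multipart upload until there is something to upload.
class LazyMultipartWriter {
  constructor(bucket, key, partBytes = DEFAULT_PART_BYTES) {
    this.bucket = bucket;       // R2 bucket binding
    this.key = key;             // object key in the bucket
    this.partBytes = partBytes;
    this.upload = null;         // created lazily in flushPart()
    this.parts = [];            // uploaded part descriptors, in order
    this.pending = new Uint8Array(0);
  }

  // Append a Uint8Array chunk and upload every full part it completes.
  async write(chunk) {
    const merged = new Uint8Array(this.pending.length + chunk.length);
    merged.set(this.pending, 0);
    merged.set(chunk, this.pending.length);
    this.pending = merged;
    while (this.pending.length >= this.partBytes) {
      await this.flushPart(this.pending.slice(0, this.partBytes));
      this.pending = this.pending.slice(this.partBytes);
    }
  }

  async flushPart(bytes) {
    if (!this.upload) {
      this.upload = await this.bucket.createMultipartUpload(this.key);
    }
    const partNumber = this.parts.length + 1; // part numbers start at 1
    this.parts.push(await this.upload.uploadPart(partNumber, bytes));
  }

  // Upload any remainder as the final part and complete the upload.
  // Returns null when nothing was ever written.
  async finish() {
    if (this.pending.length > 0) {
      await this.flushPart(this.pending);
      this.pending = new Uint8Array(0);
    }
    if (!this.upload) return null;
    return this.upload.complete(this.parts);
  }
}
```

Persisting `this.parts` and the upload identifier in Durable Object storage between chunks would be needed to survive restarts; that step is omitted here.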
API Reference
Client
createRealtimeRecorderInstance(wsUrl, options)
Creates a new recorder instance and returns a controller with:
{ sessionId, wsUrl, fft(), onNetworkError(), onNetworkRestored(), pauseRecording(), resumeRecording(), pauseSession(), finalise() }.
Options:
- chunkMs (number) - PCM chunk interval in milliseconds (default: 250)
- compressedChunkMs (number) - Compressed chunk interval in milliseconds (default: 5000)
- includeVideo (boolean) - Include dummy video track (default: false)
- pauseDurationOptimization ('conservative' | 'strict' | 'off')
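The controller's `fft()` method is intended for visualization. The README does not document its return type, so the sketch below assumes it yields an array-like of non-negative magnitudes, one per bucket, and normalizes against the current peak rather than a fixed range. `toBarHeights` is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper: scale FFT bucket magnitudes to bar heights in pixels.
// Assumes freqData is an array-like of numbers (one value per bucket).
function toBarHeights(freqData, maxHeightPx) {
  // Treat negative or non-finite values as silence.
  const values = Array.from(freqData, (v) =>
    Number.isFinite(v) && v > 0 ? v : 0
  );
  const peak = Math.max(0, ...values);
  if (peak === 0) return values.map(() => 0);
  // The loudest bucket fills the full height; others scale proportionally.
  return values.map((v) => Math.round((v / peak) * maxHeightPx));
}

// Example wiring (not executed here):
//   const heights = toBarHeights(recorder.fft({ buckets: 50 }), 120);
//   heights.forEach((h, i) => ctx.fillRect(i * 6, 120 - h, 4, h));
```

Normalizing to the peak keeps the bars visible at any input level, at the cost of hiding absolute loudness; a fixed reference value would be the alternative if the range of `fft()` is known.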
resumeRealtimeRecorderInstance(wsUrl, options)
Reconnects to an existing session (using a fresh WS URL with nonce) and returns
{ controller, state }, where state is a snapshot (recordedMs, bytesWritten, fileKey, timestamps, flags).
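After a resume, the `state` snapshot can be shown to the user so they know where the recording left off. The sketch below assumes `recordedMs` is in milliseconds and `bytesWritten` is in bytes, as the field names suggest; `summarizeState` is a hypothetical helper.

```javascript
// Hypothetical helper: turn the resume snapshot into display-friendly values.
// Assumes recordedMs is milliseconds and bytesWritten is bytes.
function summarizeState(state) {
  const totalSeconds = Math.floor(state.recordedMs / 1000);
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  const pad = (n) => String(n).padStart(2, '0');
  return {
    duration: `${pad(hours)}:${pad(minutes)}:${pad(seconds)}`,
    megabytes: (state.bytesWritten / (1024 * 1024)).toFixed(2),
    fileKey: state.fileKey
  };
}

// Example wiring (not executed here):
//   const { controller, state } = await resumeRealtimeRecorderInstance(wsUrl);
//   console.log('Resumed at', summarizeState(state).duration);
```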
Error Codes (surfaced to the client)
- INVALID_NONCE - tried to connect with a stale WS URL
- PAUSE_DURATION_LIMIT_REACHED - recording pause exceeded ~5 hours
- SESSION_EXPIRED - session not resumed within ~30 minutes
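One way an application might react to these codes is a small lookup that maps each one to a recovery step. The recovery policies below are assumptions drawn from the descriptions above, not documented library behaviour, and how the code reaches your handler is left to the application; `planRecovery` is a hypothetical name.

```javascript
// Hypothetical mapping from error code to a suggested recovery step.
const RECOVERY_PLANS = {
  // A stale URL can be replaced by minting a fresh one from the backend.
  INVALID_NONCE: { resumable: true, action: 'request-fresh-ws-url' },
  // The pause window has elapsed (assumed here to require a new session).
  PAUSE_DURATION_LIMIT_REACHED: { resumable: false, action: 'start-new-session' },
  // The resume window has elapsed (assumed here to require a new session).
  SESSION_EXPIRED: { resumable: false, action: 'start-new-session' }
};

// Return the plan for a known code, or a generic fallback for anything else.
function planRecovery(code) {
  if (Object.prototype.hasOwnProperty.call(RECOVERY_PLANS, code)) {
    return RECOVERY_PLANS[code];
  }
  return { resumable: false, action: 'report-error' };
}
```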
Worker
createRealtimeRecordingSession(r2BindingName, doNamespace, requestUrl, options)
Initializes a new recording session.
Options:
- basePrefix (string) - R2 key prefix for recordings
- metadata (object) - Custom metadata to store with recording
- format (string) - Output format: 'webm' or 'mp3'
Returns: { wsUrl, sessionId, expectedFileLocation }
handleRealtimeRecordingWebSocket(doNamespace, request)
Handles WebSocket upgrade requests for recording sessions.
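If the WebSocket route might receive ordinary HTTP requests, a Worker can check the `Upgrade` header before delegating. Whether `handleRealtimeRecordingWebSocket` already performs this check is not stated in the README, so treat the guard below as an optional, hypothetical addition.

```javascript
// Hypothetical guard: returns a 426 response for non-WebSocket requests,
// or null when the request is a WebSocket upgrade and can be delegated.
function rejectNonWebSocket(request) {
  const upgrade = request.headers.get('Upgrade') || '';
  if (upgrade.toLowerCase() === 'websocket') {
    return null;
  }
  return new Response('Expected a WebSocket upgrade request', { status: 426 });
}

// Example wiring inside the Worker's fetch handler (not executed here):
//   if (url.pathname === '/realtime-recording') {
//     const rejection = rejectNonWebSocket(request);
//     if (rejection) return rejection;
//     return handleRealtimeRecordingWebSocket(env.RECORDING_SESSIONS, request);
//   }
```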
License
ISC
Repository
https://github.com/Socria-ByteFoundry/AIRecorder
