screencapturekit-audio-capture
v1.3.6
Native Node.js addon for capturing per-application audio on macOS using ScreenCaptureKit framework. Real-time audio streaming with event-based API.
ScreenCaptureKit Audio Capture
Native Node.js addon for capturing per-application audio on macOS using the ScreenCaptureKit framework
Capture real-time audio from any macOS application with a simple, event-driven API. Built with N-API for Node.js compatibility and ScreenCaptureKit for system-level audio access.
📖 Table of Contents
- Features
- Requirements
- Installation
- Project Structure
- Quick Start
- Quick Integration Guide
- Module Exports
- Testing
- Stream-Based API
- API Reference
- Multi-Process Capture Service
- Events Reference
- TypeScript
- Working with Audio Data
- Resource Lifecycle
- Common Issues
- Examples
- Platform Support
- Performance
- Contributing
- License
Features
- 🎵 Per-App Audio Capture - Isolate audio from specific applications, windows, or displays
- 🎭 Multi-Source Capture - Capture from multiple apps simultaneously with mixed output
- ⚡ Real-Time Streaming - Low-latency callbacks with Event or Stream-based APIs
- 🔄 Multi-Process Service - Server/client architecture for sharing audio across processes
- 📊 Audio Utilities - Built-in RMS/peak/dB analysis and WAV file export
- 📘 TypeScript-First - Full type definitions with memory-safe resource cleanup
Requirements
- macOS 13.0 (Ventura) or later
- Node.js 14.0.0 or later (Node.js 18+ recommended for running the automated test suite)
- Screen Recording permission (granted in System Settings under Privacy & Security; the app was named System Preferences before macOS 13)
Installation
npm install screencapturekit-audio-capture
Prebuilt binaries are included for Apple Silicon (ARM64) machines, so no compilation or Xcode is required there.
Fallback Compilation
If no prebuild is available for your architecture, the addon will compile from source automatically. This requires:
- Xcode Command Line Tools (minimum version 14.0)
  xcode-select --install
- macOS SDK 13.0 or later
The build process links these macOS frameworks:
- ScreenCaptureKit - Per-application audio capture
- AVFoundation - Audio processing
- CoreMedia - Media sample handling
- CoreVideo - Video frame handling
- Foundation - Core Objective-C runtime
All frameworks are part of the macOS system and require no additional installation.
Package Contents
When installed from npm, the package includes:
- src/ - TypeScript SDK source code and native C++/Objective-C++ code
- dist/ - Compiled JavaScript and TypeScript declarations
- binding.gyp - Native build configuration
- README.md, LICENSE, CHANGELOG.md
Note: Example files are available in the GitHub repository but are not included in the npm package to reduce installation size.
See npm ls screencapturekit-audio-capture for installation location.
Project Structure
screencapturekit-audio-capture/
├── src/ # Source code
│ ├── capture/ # Core audio capture functionality
│ │ ├── index.ts # Barrel exports
│ │ ├── audio-capture.ts # Main AudioCapture class
│ │ └── audio-stream.ts # Readable stream wrapper
│ │
│ ├── native/ # Native C++/Objective-C++ code
│ │ ├── addon.mm # Node.js N-API bindings
│ │ ├── wrapper.h # C++ header
│ │ └── wrapper.mm # ScreenCaptureKit implementation
│ │
│ ├── service/ # Multi-process capture service
│ │ ├── index.ts # Barrel exports
│ │ ├── server.ts # WebSocket server for shared capture
│ │ └── client.ts # WebSocket client
│ │
│ ├── utils/ # Utility modules
│ │ ├── index.ts # Barrel exports
│ │ ├── stt-converter.ts # Speech-to-text transform stream
│ │ └── native-loader.ts # Native addon loader
│ │
│ ├── core/ # Shared types, errors, and lifecycle
│ │ ├── index.ts # Barrel exports
│ │ ├── types.ts # TypeScript type definitions
│ │ ├── errors.ts # Error classes and codes
│ │ └── cleanup.ts # Resource cleanup utilities
│ │
│ └── index.ts # Main package exports
│
├── dist/ # Compiled JavaScript output
├── tests/ # Test suites (unit, integration, edge-cases)
├── readme_examples/ # Runnable example scripts
├── prebuilds/ # Prebuilt native binaries
├── build/ # Native compilation output
│
├── package.json # Package manifest
├── tsconfig.json # TypeScript configuration
├── binding.gyp # Native addon build configuration
├── CHANGELOG.md # Version history
├── LICENSE # MIT License
└── README.md # This file
Quick Start
📁 See readme_examples/basics/01-quick-start.ts for runnable code
import { AudioCapture } from 'screencapturekit-audio-capture';

const capture = new AudioCapture();
// Try each app in order; fall back to the first available audio app
const app = capture.selectApp(['Spotify', 'Music', 'Safari'], { fallbackToFirst: true });
if (!app) throw new Error('No capturable application found');

capture.on('audio', (sample) => {
  console.log(`Volume: ${AudioCapture.rmsToDb(sample.rms).toFixed(1)} dB`);
});

capture.startCapture(app.processId);
setTimeout(() => capture.stopCapture(), 10000); // Stop after 10 seconds
Quick Integration Guide
📁 All integration patterns below have runnable examples in readme_examples/
Common patterns for integrating audio capture into your application:
| Pattern | Example File | Description |
|---------|--------------|-------------|
| STT Integration | voice/02-stt-integration.ts | Stream + event-based approaches for speech-to-text |
| Voice Agent | voice/03-voice-agent.ts | Real-time processing with low-latency config |
| Recording | voice/04-audio-recording.ts | Capture to WAV file with efficient settings |
| Robust Capture | basics/05-robust-capture.ts | Production error handling with fallbacks |
| Multi-App | capture-targets/13-multi-app-capture.ts | Capture game + Discord, Zoom + Music, etc. |
| Multi-Process | advanced/20-capture-service.ts | Share audio across multiple processes |
| Graceful Cleanup | advanced/21-graceful-cleanup.ts | Resource lifecycle and cleanup utilities |
Key Configuration Patterns
For STT engines:
{ format: 'int16', channels: 1, minVolume: 0.01 } // Int16 mono, silence filtered
For low-latency voice processing:
{ format: 'int16', channels: 1, bufferSize: 1024, minVolume: 0.005 }
For recording:
{ format: 'int16', channels: 2, bufferSize: 4096 } // Stereo, larger buffer for stability
Audio Sample Structure
| Property | Type | Description |
|----------|------|-------------|
| data | Buffer | Audio data (Float32 or Int16) |
| sampleRate | number | Sample rate in Hz (e.g., 48000) |
| channels | number | 1 = mono, 2 = stereo |
| format | 'float32' \| 'int16' | Audio format |
| rms | number | RMS volume (0.0-1.0) |
| peak | number | Peak volume (0.0-1.0) |
| timestamp | number | Timestamp in seconds |
| durationMs | number | Duration in milliseconds |
| sampleCount | number | Total samples across all channels |
| framesCount | number | Frames per channel |
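The size-related fields in the table above appear to be derivable from one another. The following is a minimal sketch of that relationship, assuming interleaved PCM data; the `SampleShape` interface and `deriveCounts` function are hypothetical names used for illustration and are not part of the package.

```typescript
// Hypothetical shape mirroring part of the table above; not imported from the package.
interface SampleShape {
  data: Uint8Array;
  sampleRate: number;
  channels: number;
  format: 'float32' | 'int16';
}

// Bytes occupied by one sample value in each format.
const BYTES_PER_SAMPLE = { float32: 4, int16: 2 } as const;

function deriveCounts(sample: SampleShape) {
  // Total sample values across all channels.
  const sampleCount = sample.data.byteLength / BYTES_PER_SAMPLE[sample.format];
  // One frame holds one sample per channel.
  const framesCount = sampleCount / sample.channels;
  const durationMs = (framesCount / sample.sampleRate) * 1000;
  return { sampleCount, framesCount, durationMs };
}

const counts = deriveCounts({
  data: new Uint8Array(2048 * 2 * 4), // 2048 stereo float32 frames
  sampleRate: 48000,
  channels: 2,
  format: 'float32',
});
console.log(counts);
```

Under these assumptions, a 2048-frame stereo buffer at 48 kHz carries 4096 sample values and lasts roughly 42.7 ms.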
Module Exports
import { AudioCapture, AudioCaptureError, ErrorCode } from 'screencapturekit-audio-capture';
import type { AudioSample, ApplicationInfo } from 'screencapturekit-audio-capture';
// Multi-process capture service (for sharing audio across processes)
import { AudioCaptureServer, AudioCaptureClient } from 'screencapturekit-audio-capture';
// Resource cleanup utilities
import { cleanupAll, getActiveInstanceCount, installGracefulShutdown } from 'screencapturekit-audio-capture';

| Export | Description |
|--------|-------------|
| AudioCapture | High-level event-based API (recommended) |
| AudioStream | Readable stream (via createAudioStream()) |
| STTConverter | Transform stream for STT (via createSTTStream()) |
| AudioCaptureServer | WebSocket server for shared capture (multi-process) |
| AudioCaptureClient | WebSocket client to receive shared audio |
| AudioCaptureError | Error class with codes and details |
| ErrorCode | Error code enum for type-safe handling |
| cleanupAll | Dispose all AudioCapture and AudioCaptureServer instances |
| getActiveInstanceCount | Get total active instance count |
| installGracefulShutdown | Install process exit handlers for cleanup |
| ScreenCaptureKit | Low-level native binding (advanced) |
Types: AudioSample, ApplicationInfo, WindowInfo, DisplayInfo, CaptureOptions, PermissionStatus, ActivityInfo, ServerOptions, ClientOptions, and more.
Testing
Note: Test files are available in the GitHub repository but are not included in the npm package.
Tests are written in TypeScript and live under tests/. They use Node's built-in test runner with tsx (Node 18+).
Test Commands:
- npm test - Runs every suite in tests/**/*.test.ts (unit, integration, edge-cases) against the mocked ScreenCaptureKit layer; works cross-platform.
- npm run test:unit - Fast coverage for utilities, audio metrics, selection, and capture control.
- npm run test:integration - Multi-component flows (window/display capture, activity tracking, capability guards) using the shared mock.
- npm run test:edge-cases - Boundary/error handling coverage.
Type Checking:
- npm run typecheck - Type-check the SDK source code.
- npm run typecheck:tests - Type-check the test files.
For true hardware validation, run the example scripts on macOS with Screen Recording permission enabled.
Stream-Based API
📁 See readme_examples/streams/06-stream-basics.ts and readme_examples/streams/07-stream-processing.ts for runnable examples
Use Node.js Readable streams for composable audio processing:
const audioStream = capture.createAudioStream('Spotify', { minVolume: 0.01 });
audioStream.pipe(yourWritableStream);
// Object mode for metadata access
const metaStream = capture.createAudioStream('Spotify', { objectMode: true });
metaStream.on('data', (sample) => console.log(`RMS: ${sample.rms}`));
When to Use Streams vs Events
| Use Case | Recommended API |
|----------|-----------------|
| Piping through transforms | Stream |
| Backpressure handling | Stream |
| Multiple listeners | Event |
| Maximum simplicity | Event |
Both APIs use the same underlying capture mechanism and have identical performance.
Stream API Best Practices
- Always handle errors - Attach an error handler to prevent crashes
- Use pipeline() - Better error handling than chaining .pipe()
- Clean up resources - Call stream.stop() when done
- Choose the right mode - Normal mode for raw data, object mode for metadata
- Stream must flow - Attach a data listener to start capture
import { pipeline } from 'stream';

// Recommended pattern (transform and writable are your own streams)
pipeline(audioStream, transform, writable, (err) => {
  if (err) console.error('Pipeline failed:', err);
});

// Always handle SIGINT
process.on('SIGINT', () => audioStream.stop());
Troubleshooting Stream Issues
| Issue | Cause | Solution |
|-------|-------|----------|
| "Application not found" | App not running | Use selectApp() with fallbacks |
| No data events | App not playing audio / minVolume too high | Verify app is playing; lower or remove threshold |
| "stream.push() after EOF" | Stopping abruptly | Use pipeline() for proper cleanup |
| "Already capturing" | Multiple streams from one instance | Create separate AudioCapture instances |
| Memory growing | Not consuming data | Attach data listener; use circular buffer |
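The "circular buffer" suggested for the memory-growth case can be sketched as follows. This is an illustrative, standalone implementation; the `RingBuffer` class is a hypothetical helper and is not exported by the package. It keeps only the most recent samples, so memory stays bounded no matter how long capture runs.

```typescript
// Fixed-capacity ring buffer of float samples; the oldest data is overwritten.
class RingBuffer {
  private readonly buf: Float32Array;
  private writePos = 0;
  private filled = 0;

  constructor(capacity: number) {
    this.buf = new Float32Array(capacity);
  }

  push(samples: Float32Array): void {
    for (let i = 0; i < samples.length; i++) {
      this.buf[this.writePos] = samples[i];
      this.writePos = (this.writePos + 1) % this.buf.length;
    }
    this.filled = Math.min(this.buf.length, this.filled + samples.length);
  }

  // Returns the retained samples in chronological order.
  toArray(): Float32Array {
    const out = new Float32Array(this.filled);
    const start = (this.writePos - this.filled + this.buf.length) % this.buf.length;
    for (let i = 0; i < this.filled; i++) {
      out[i] = this.buf[(start + i) % this.buf.length];
    }
    return out;
  }
}

const ring = new RingBuffer(4);
ring.push(Float32Array.from([1, 2, 3]));
ring.push(Float32Array.from([4, 5, 6]));
const retained = Array.from(ring.toArray());
console.log(retained); // [3, 4, 5, 6]
```

In a capture handler, each incoming buffer would first be converted to a Float32Array (for example with AudioCapture.bufferToFloat32Array) and then passed to push().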
Stream Performance Tips
- Normal mode is faster than object mode (no metadata calculation)
- Batch processing is more efficient than per-sample processing
- Default highWaterMark is suitable for most cases
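One way to apply the batching tip is a Transform stream that regroups incoming chunks into fixed-size batches before they reach downstream consumers. The sketch below uses only Node.js's built-in stream module; `BatchTransform` is a hypothetical class written for illustration, not something the package provides.

```typescript
import { Transform } from 'node:stream';
import type { TransformCallback } from 'node:stream';

// Collects incoming chunks and emits one Buffer per `batchBytes` bytes.
class BatchTransform extends Transform {
  private pending: Buffer[] = [];
  private pendingBytes = 0;
  private readonly batchBytes: number;

  constructor(batchBytes: number) {
    super();
    this.batchBytes = batchBytes;
  }

  _transform(chunk: Buffer, _enc: BufferEncoding, done: TransformCallback): void {
    this.pending.push(chunk);
    this.pendingBytes += chunk.length;
    while (this.pendingBytes >= this.batchBytes) {
      const all = Buffer.concat(this.pending);
      this.push(all.subarray(0, this.batchBytes));
      const rest = all.subarray(this.batchBytes);
      this.pending = rest.length > 0 ? [rest] : [];
      this.pendingBytes = rest.length;
    }
    done();
  }

  // Emit whatever is left when the source ends.
  _flush(done: TransformCallback): void {
    if (this.pendingBytes > 0) this.push(Buffer.concat(this.pending));
    done();
  }
}

// Demo: 13 bytes arrive in three chunks and leave as batches of 8 and 5 bytes.
const batcher = new BatchTransform(8);
const batchSizes: number[] = [];
const batchesDone = new Promise<number[]>((resolve) => {
  batcher.on('data', (b: Buffer) => batchSizes.push(b.length));
  batcher.on('end', () => resolve(batchSizes));
});
batcher.write(Buffer.alloc(5));
batcher.write(Buffer.alloc(5));
batcher.end(Buffer.alloc(3));
batchesDone.then((sizes) => console.log(sizes));
```

With a real capture, the audio stream would be the first argument to pipeline() and the batcher would sit between it and the consumer.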
📁 See readme_examples/streams/07-stream-processing.ts for a complete production-ready stream example
API Reference
Class: AudioCapture
High-level event-based API (recommended).
Methods Overview
| # | Category | Method | Description |
|---|----------|--------|-------------|
| | Discovery | | |
| 1 | | getApplications(opts?) | List all capturable apps |
| 2 | | getAudioApps(opts?) | List apps likely to produce audio |
| 3 | | findApplication(id) | Find app by name or bundle ID |
| 4 | | findByName(name) | Alias for findApplication() |
| 5 | | getApplicationByPid(pid) | Find app by process ID |
| 6 | | getWindows(opts?) | List all capturable windows |
| 7 | | getDisplays() | List all displays |
| | Selection | | |
| 8 | | selectApp(ids?, opts?) | Smart app selection with fallbacks |
| | Capture | | |
| 9 | | startCapture(app, opts?) | Start capturing from an app |
| 10 | | captureWindow(id, opts?) | Capture from a specific window |
| 11 | | captureDisplay(id, opts?) | Capture from a display |
| 12 | | captureMultipleApps(ids, opts?) | Capture multiple apps (mixed) |
| 13 | | captureMultipleWindows(ids, opts?) | Capture multiple windows (mixed) |
| 14 | | captureMultipleDisplays(ids, opts?) | Capture multiple displays (mixed) |
| 15 | | stopCapture() | Stop current capture |
| 16 | | isCapturing() | Check if currently capturing |
| 17 | | getStatus() | Get detailed capture status |
| 18 | | getCurrentCapture() | Get current capture target info |
| | Streams | | |
| 19 | | createAudioStream(app, opts?) | Create Node.js Readable stream |
| 20 | | createSTTStream(app?, opts?) | Stream pre-configured for STT |
| | Activity | | |
| 21 | | enableActivityTracking(opts?) | Track which apps produce audio |
| 22 | | disableActivityTracking() | Stop tracking and clear cache |
| 23 | | getActivityInfo() | Get tracking stats |
| | Lifecycle | | |
| 24 | | dispose() | Release resources and stop capture |
| 25 | | isDisposed() | Check if instance is disposed |
Static Methods
| # | Method | Description |
|---|--------|-------------|
| S1 | AudioCapture.verifyPermissions() | Check screen recording permission |
| S2 | AudioCapture.bufferToFloat32Array(buf) | Convert Buffer to Float32Array |
| S3 | AudioCapture.rmsToDb(rms) | Convert RMS (0-1) to decibels |
| S4 | AudioCapture.peakToDb(peak) | Convert peak (0-1) to decibels |
| S5 | AudioCapture.calculateDb(buf, method?) | Calculate dB from audio buffer |
| S6 | AudioCapture.writeWav(buf, opts) | Create WAV file from PCM data |
| S7 | AudioCapture.cleanupAll() | Dispose all active instances |
| S8 | AudioCapture.getActiveInstanceCount() | Get number of active instances |
Events
| Event | Payload | Description |
|-------|---------|-------------|
| 'start' | CaptureInfo | Capture started |
| 'audio' | AudioSample | Audio data received |
| 'stop' | CaptureInfo | Capture stopped |
| 'error' | AudioCaptureError | Error occurred |
Method Reference
Discovery Methods
[1] getApplications(options?): ApplicationInfo[]
List all capturable applications.
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| includeEmpty | boolean | false | Include apps with empty names (helpers, background services) |
const apps = capture.getApplications();
const allApps = capture.getApplications({ includeEmpty: true });
[2] getAudioApps(options?): ApplicationInfo[]
List apps likely to produce audio. Filters system apps, utilities, and background processes.
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| includeSystemApps | boolean | false | Include system apps (Finder, etc.) |
| includeEmpty | boolean | false | Include apps with empty names |
| sortByActivity | boolean | false | Sort by recent audio activity (requires [21]) |
| appList | Array | null | Reuse prefetched app list |
const audioApps = capture.getAudioApps();
// Returns ApplicationInfo objects for apps such as Spotify, Safari, Music, Zoom
// Excludes: Finder, Terminal, System Preferences, etc.

// Sort by activity (most active first)
capture.enableActivityTracking();
const sorted = capture.getAudioApps({ sortByActivity: true });
[3] findApplication(identifier): ApplicationInfo | null
Find app by name or bundle ID (case-insensitive, partial match).
| Parameter | Type | Description |
|-----------|------|-------------|
| identifier | string | App name or bundle ID |
const spotify = capture.findApplication('Spotify');
const safari = capture.findApplication('com.apple.Safari');
const partial = capture.findApplication('spot'); // Matches "Spotify"
[4] findByName(name): ApplicationInfo | null
Alias for findApplication(). Provided for semantic clarity.
[5] getApplicationByPid(processId): ApplicationInfo | null
Find app by process ID.
| Parameter | Type | Description |
|-----------|------|-------------|
| processId | number | Process ID |
const app = capture.getApplicationByPid(12345);
[6] getWindows(options?): WindowInfo[]
List all capturable windows.
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| onScreenOnly | boolean | false | Only include visible windows |
| requireTitle | boolean | false | Only include windows with titles |
| processId | number | - | Filter by owning process ID |
Returns WindowInfo:
- windowId: Unique window identifier
- title: Window title
- owningProcessId: PID of owning app
- owningApplicationName: App name
- owningBundleIdentifier: Bundle ID
- frame: { x, y, width, height }
- layer: Window layer level
- onScreen: Whether visible
- active: Whether active

const windows = capture.getWindows({ onScreenOnly: true, requireTitle: true });
windows.forEach(w => console.log(`${w.windowId}: ${w.title} (${w.owningApplicationName})`));
[7] getDisplays(): DisplayInfo[]
List all displays.
Returns DisplayInfo:
- displayId: Unique display identifier
- width: Display width in pixels
- height: Display height in pixels
- frame: { x, y, width, height }
- isMainDisplay: Whether this is the primary display

const displays = capture.getDisplays();
const main = displays.find(d => d.isMainDisplay);
Selection Method
[8] selectApp(identifiers?, options?): ApplicationInfo | null
Smart app selection with multiple fallback strategies.
| Parameter | Type | Description |
|-----------|------|-------------|
| identifiers | string \| number \| Array \| null | App name, PID, bundle ID, or array to try in order |
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| audioOnly | boolean | true | Only search audio apps |
| fallbackToFirst | boolean | false | Return first app if no match |
| throwOnNotFound | boolean | false | Throw error instead of returning null |
| sortByActivity | boolean | false | Sort by recent activity (requires [21]) |
| appList | Array | null | Reuse prefetched app list |
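The fallback behavior described by these options can be pictured with a small standalone function. This is only a sketch of the idea, not the package's implementation: `AppLike`, its `bundleIdentifier` field name, and `pickApp` are assumptions introduced here, and the matching rules (case-insensitive partial name match, exact bundle ID match) are a simplification.

```typescript
// Hypothetical minimal app record; the real ApplicationInfo may differ.
interface AppLike {
  processId: number;
  applicationName: string;
  bundleIdentifier: string; // assumed field name
}

// Tries each identifier in order, then optionally falls back to the first app.
function pickApp(
  apps: AppLike[],
  identifiers: Array<string | number>,
  fallbackToFirst = false,
): AppLike | null {
  for (const id of identifiers) {
    const match = apps.find((app) =>
      typeof id === 'number'
        ? app.processId === id
        : app.applicationName.toLowerCase().includes(id.toLowerCase()) ||
          app.bundleIdentifier.toLowerCase() === id.toLowerCase(),
    );
    if (match) return match;
  }
  return fallbackToFirst && apps.length > 0 ? apps[0] : null;
}

const running: AppLike[] = [
  { processId: 101, applicationName: 'Safari', bundleIdentifier: 'com.apple.Safari' },
  { processId: 202, applicationName: 'Music', bundleIdentifier: 'com.apple.Music' },
];
console.log(pickApp(running, ['Spotify', 'Music'])?.applicationName); // Music
console.log(pickApp(running, ['Spotify'], true)?.applicationName); // Safari
console.log(pickApp(running, ['Spotify'])); // null
```

The ordering matters: identifiers earlier in the array win, and the fallback is consulted only after every identifier has failed to match.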
// Try multiple apps in order
const app = capture.selectApp(['Spotify', 'Music', 'Safari']);
// Get first audio app
const first = capture.selectApp();
// Fallback to first if none match
const fallback = capture.selectApp(['Spotify'], { fallbackToFirst: true });
// Throw on failure
try {
  const app = capture.selectApp(['Spotify'], { throwOnNotFound: true });
} catch (err) {
  console.log('Not found:', err.details.availableApps);
}
Capture Methods
All capture methods accept CaptureOptions:
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| format | 'float32' \| 'int16' | 'float32' | Audio format |
| channels | 1 \| 2 | 2 | Mono or stereo |
| sampleRate | number | 48000 | Requested sample rate (system-dependent) |
| bufferSize | number | system | Buffer size in frames (affects latency) |
| minVolume | number | 0 | Min RMS threshold (0-1), filters silence |
| excludeCursor | boolean | true | Reserved for future video features |
Buffer Size Guidelines:
- 1024: ~21ms latency, higher CPU
- 2048: ~43ms latency, balanced (recommended)
- 4096: ~85ms latency, lower CPU
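The latency figures above follow from dividing the buffer size in frames by the sample rate. A minimal sketch, assuming the default 48000 Hz rate (`bufferLatencyMs` is a hypothetical helper, not a package function):

```typescript
// Latency contributed by one buffer: frames / sampleRate, in milliseconds.
function bufferLatencyMs(frames: number, sampleRate = 48000): number {
  return (frames / sampleRate) * 1000;
}

for (const frames of [1024, 2048, 4096]) {
  console.log(`${frames} frames -> ${bufferLatencyMs(frames).toFixed(1)} ms`);
}
```

At other sample rates the same buffer size yields a different latency; for example, 1024 frames at 44100 Hz is about 23.2 ms.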
[9] startCapture(appIdentifier, options?): boolean
Start capturing from an application.
| Parameter | Type | Description |
|-----------|------|-------------|
| appIdentifier | string \| number \| ApplicationInfo | App name, bundle ID, PID, or app object |
capture.startCapture('Spotify'); // By name
capture.startCapture('com.spotify.client'); // By bundle ID
capture.startCapture(12345); // By PID
capture.startCapture(app); // By object
// With options
capture.startCapture('Spotify', {
  format: 'int16',
  channels: 1,
  minVolume: 0.01
});
[10] captureWindow(windowId, options?): boolean
Capture audio from a specific window.
| Parameter | Type | Description |
|-----------|------|-------------|
| windowId | number | Window ID from getWindows() |
const windows = capture.getWindows({ requireTitle: true });
const target = windows.find(w => w.title.includes('Safari'));
// find() returns undefined when no window matches
if (target) capture.captureWindow(target.windowId, { format: 'int16' });
[11] captureDisplay(displayId, options?): boolean
Capture audio from a display.
| Parameter | Type | Description |
|-----------|------|-------------|
| displayId | number | Display ID from getDisplays() |
const displays = capture.getDisplays();
const main = displays.find(d => d.isMainDisplay);
// find() returns undefined when no display is flagged as main
if (main) capture.captureDisplay(main.displayId);
[12] captureMultipleApps(appIdentifiers, options?): boolean
Capture from multiple apps simultaneously. Audio is mixed into a single stream.
| Parameter | Type | Description |
|-----------|------|-------------|
| appIdentifiers | Array | App names, PIDs, bundle IDs, or ApplicationInfo objects |
| Additional Option | Type | Default | Description |
|-------------------|------|---------|-------------|
| allowPartial | boolean | false | Continue if some apps not found |
// Capture game + Discord audio
capture.captureMultipleApps(['Minecraft', 'Discord'], {
  allowPartial: true, // Continue even if one app not found
  format: 'int16'
});
[13] captureMultipleWindows(windowIdentifiers, options?): boolean
Capture from multiple windows. Audio is mixed.
| Parameter | Type | Description |
|-----------|------|-------------|
| windowIdentifiers | Array | Window IDs or WindowInfo objects |
| Additional Option | Type | Default | Description |
|-------------------|------|---------|-------------|
| allowPartial | boolean | false | Continue if some windows not found |
const windows = capture.getWindows({ requireTitle: true });
const browserWindows = windows.filter(w => /Safari|Chrome/.test(w.owningApplicationName));
capture.captureMultipleWindows(browserWindows.map(w => w.windowId));
[14] captureMultipleDisplays(displayIdentifiers, options?): boolean
Capture from multiple displays. Audio is mixed.
| Parameter | Type | Description |
|-----------|------|-------------|
| displayIdentifiers | Array | Display IDs or DisplayInfo objects |
| Additional Option | Type | Default | Description |
|-------------------|------|---------|-------------|
| allowPartial | boolean | false | Continue if some displays not found |
const displays = capture.getDisplays();
capture.captureMultipleDisplays(displays.map(d => d.displayId));
[15] stopCapture(): void
Stop the current capture session. Emits 'stop' event.
[16] isCapturing(): boolean
Check if currently capturing.
if (capture.isCapturing()) {
  capture.stopCapture();
}
[17] getStatus(): CaptureStatus | null
Get detailed capture status. Returns null if not capturing.
Returns CaptureStatus:
- capturing: Always true when not null
- processId: Process ID (may be null for display capture)
- app: ApplicationInfo or null
- window: WindowInfo or null
- display: DisplayInfo or null
- targetType: 'application' | 'window' | 'display' | 'multi-app'
- config: { minVolume, format }

const status = capture.getStatus();
if (status) {
  console.log(`Type: ${status.targetType}, App: ${status.app?.applicationName}`);
}
[18] getCurrentCapture(): CaptureInfo | null
Get current capture target info. Same as getStatus() but without config.
Stream Methods
[19] createAudioStream(appIdentifier, options?): AudioStream
Create a Node.js Readable stream for audio capture.
| Parameter | Type | Description |
|-----------|------|-------------|
| appIdentifier | string \| number | App name, bundle ID, or PID |
| Additional Option | Type | Default | Description |
|-------------------|------|---------|-------------|
| objectMode | boolean | false | Emit AudioSample objects instead of Buffers |
// Raw buffer mode (for piping)
const stream = capture.createAudioStream('Spotify');
stream.pipe(myWritable);

// Object mode (for metadata access); use this instead of the raw stream above
const objectStream = capture.createAudioStream('Spotify', { objectMode: true });
objectStream.on('data', (sample) => console.log(`RMS: ${sample.rms}`));

// Stop stream
stream.stop();
[20] createSTTStream(appIdentifier?, options?): STTConverter
Create stream pre-configured for Speech-to-Text engines.
| Parameter | Type | Description |
|-----------|------|-------------|
| appIdentifier | string \| number \| Array \| null | App identifier(s), null for auto-select |
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| format | 'int16' \| 'float32' | 'int16' | Output format |
| channels | 1 \| 2 | 1 | Output channels (mono recommended) |
| objectMode | boolean | false | Emit objects with metadata |
| autoSelect | boolean | true | Auto-select first audio app if not found |
| minVolume | number | - | Silence filter threshold |
// Auto-selects first audio app, converts to Int16 mono
const sttStream = capture.createSTTStream();
sttStream.pipe(yourSTTEngine);

// With fallback apps (alternative to the auto-select call above)
const fallbackStream = capture.createSTTStream(['Zoom', 'Safari', 'Chrome']);

// Access selected app
console.log(`Selected: ${sttStream.app.applicationName}`);

// Stop
sttStream.stop();
Activity Tracking Methods
[21] enableActivityTracking(options?): void
Enable background tracking of audio activity. Useful for sorting apps by recent audio.
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| decayMs | number | 30000 | Remove apps from cache after this many ms of inactivity |
capture.enableActivityTracking({ decayMs: 60000 }); // 60s decay
[22] disableActivityTracking(): void
Disable tracking and clear the cache.
capture.disableActivityTracking();
[23] getActivityInfo(): ActivityInfo
Get activity tracking status and statistics.
Returns ActivityInfo:
- enabled: Whether tracking is enabled
- trackedApps: Number of apps in cache
- recentApps: Array of ProcessActivityInfo:
  - processId: Process ID
  - lastSeen: Timestamp of last audio
  - ageMs: Time since last audio
  - avgRMS: Average RMS level
  - sampleCount: Number of samples received

const info = capture.getActivityInfo();
console.log(`Active apps: ${info.trackedApps}`);
info.recentApps.forEach(app => {
  console.log(`PID ${app.processId}: ${app.sampleCount} samples`);
});
Lifecycle Methods
[24] dispose(): void
Release all resources and stop any active capture. Safe to call multiple times (idempotent).
const capture = new AudioCapture();
capture.startCapture('Spotify');
// When done, release resources
capture.dispose();
// Instance can no longer be used
console.log(capture.isDisposed()); // true
[25] isDisposed(): boolean
Check if this instance has been disposed.
if (!capture.isDisposed()) {
  capture.startCapture('Spotify');
}
Note: Calling methods like startCapture(), captureWindow(), or captureDisplay() on a disposed instance will throw an error.
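Because a disposed instance cannot be reused, it can help to tie disposal to the scope that uses the instance. The sketch below shows one such pattern with try/finally; `HasDispose` and `withResource` are hypothetical names for illustration, and the demo uses a stand-in object rather than a real AudioCapture so it runs anywhere.

```typescript
// Structural type: anything with dispose() works, e.g. an AudioCapture instance.
interface HasDispose {
  dispose(): void;
}

// Runs `work` and always disposes the resource, even if `work` throws.
async function withResource<T extends HasDispose, R>(
  resource: T,
  work: (resource: T) => Promise<R> | R,
): Promise<R> {
  try {
    return await work(resource);
  } finally {
    resource.dispose();
  }
}

// Demo with a stand-in object that counts dispose() calls.
let disposeCalls = 0;
const fakeCapture = { dispose: () => { disposeCalls++; } };
const outcome = withResource(fakeCapture, () => 'done');
outcome.then((value) => console.log(value, disposeCalls));
```

Since dispose() is documented as idempotent, calling it again elsewhere (for example from a shutdown handler) should be harmless.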
Static Method Reference
[S1] AudioCapture.verifyPermissions(): PermissionStatus
Check screen recording permission before capture.
Returns PermissionStatus:
- granted: Whether permission is granted
- message: Human-readable status
- apps: Prefetched app list (reuse with selectApp({ appList }))
- availableApps: Number of apps found
- remediation: Fix instructions (if not granted)

const status = AudioCapture.verifyPermissions();
if (!status.granted) {
  console.error(status.message);
  console.log(status.remediation);
  process.exit(1);
}
// Reuse apps list
const app = capture.selectApp(['Spotify'], { appList: status.apps });
[S2] AudioCapture.bufferToFloat32Array(buffer): Float32Array
Convert Buffer to Float32Array for audio processing.
capture.on('audio', (sample) => {
  const floats = AudioCapture.bufferToFloat32Array(sample.data);
  // Process individual samples
  for (let i = 0; i < floats.length; i++) {
    const value = floats[i]; // Range: -1.0 to 1.0
  }
});
[S3] AudioCapture.rmsToDb(rms): number
Convert RMS value (0-1) to decibels.
const halfScaleDb = AudioCapture.rmsToDb(0.5); // -6.02 dB
const sampleDb = AudioCapture.rmsToDb(sample.rms);
[S4] AudioCapture.peakToDb(peak): number
Convert peak value (0-1) to decibels.
const db = AudioCapture.peakToDb(sample.peak);
[S5] AudioCapture.calculateDb(buffer, method?): number
Calculate dB level directly from audio buffer.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| buffer | Buffer | - | Audio data buffer |
| method | 'rms' \| 'peak' | 'rms' | Calculation method |
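For readers who want to see what an RMS or peak dB calculation involves, here is a standalone sketch using the usual 20 * log10(level) conversion for amplitudes in the 0 to 1 range. It is not the package's implementation; `levelDb` is a hypothetical function that operates on a Float32Array rather than a Buffer.

```typescript
// Computes a dB level (relative to full scale) from float samples.
function levelDb(samples: Float32Array, method: 'rms' | 'peak' = 'rms'): number {
  if (samples.length === 0) return -Infinity;
  let level = 0;
  if (method === 'peak') {
    // Peak: largest absolute sample value.
    for (let i = 0; i < samples.length; i++) {
      level = Math.max(level, Math.abs(samples[i]));
    }
  } else {
    // RMS: square root of the mean of squared samples.
    let sumSquares = 0;
    for (let i = 0; i < samples.length; i++) {
      sumSquares += samples[i] * samples[i];
    }
    level = Math.sqrt(sumSquares / samples.length);
  }
  return 20 * Math.log10(level); // a level of 0 gives -Infinity (digital silence)
}

const squareWave = Float32Array.from([0.5, -0.5, 0.5, -0.5]);
console.log(levelDb(squareWave, 'rms').toFixed(2)); // -6.02
console.log(levelDb(squareWave, 'peak').toFixed(2)); // -6.02
```

For a square wave RMS and peak coincide; for most real audio the RMS value is noticeably lower than the peak.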
capture.on('audio', (sample) => {
  const rmsDb = AudioCapture.calculateDb(sample.data, 'rms');
  const peakDb = AudioCapture.calculateDb(sample.data, 'peak');
});
[S6] AudioCapture.writeWav(buffer, options): Buffer
Create a complete WAV file from PCM audio data.
| Option | Type | Required | Description |
|--------|------|----------|-------------|
| sampleRate | number | ✓ | Sample rate in Hz |
| channels | number | ✓ | Number of channels |
| format | 'float32' \| 'int16' | | Audio format (default: 'float32') |
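The three options map directly onto fields of a WAV header. The sketch below builds a canonical 44-byte RIFF/WAVE header to show where each value goes. It is an illustration under simplifying assumptions (no extension or fact chunk for float data) and says nothing about how writeWav is actually implemented; `wavHeader` is a hypothetical function.

```typescript
// Builds a 44-byte RIFF/WAVE header for PCM (int16) or IEEE float (float32) data.
function wavHeader(
  dataBytes: number,
  sampleRate: number,
  channels: number,
  format: 'float32' | 'int16',
): Uint8Array {
  const bitsPerSample = format === 'float32' ? 32 : 16;
  const blockAlign = channels * (bitsPerSample / 8); // bytes per frame
  const header = new Uint8Array(44);
  const view = new DataView(header.buffer);
  const ascii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  ascii(0, 'RIFF');
  view.setUint32(4, 36 + dataBytes, true); // file size minus the first 8 bytes
  ascii(8, 'WAVE');
  ascii(12, 'fmt ');
  view.setUint32(16, 16, true); // size of the fmt chunk body
  view.setUint16(20, format === 'float32' ? 3 : 1, true); // 3 = IEEE float, 1 = PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  ascii(36, 'data');
  view.setUint32(40, dataBytes, true);
  return header;
}

// 10 ms of 48 kHz stereo int16 audio is 480 frames * 4 bytes = 1920 bytes.
const header = wavHeader(1920, 48000, 2, 'int16');
console.log(header.length); // 44
```

Passing the wrong format here is the usual cause of WAV files that play back as noise, which is why the format option should match the format used for capture.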
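import fs from 'fs';

// Collect chunks during capture, then write one WAV file when capture stops
// (writing inside the 'audio' handler would overwrite the file on every sample)
const chunks = [];
let last = null;
capture.on('audio', (sample) => {
  chunks.push(sample.data);
  last = sample;
});
capture.on('stop', () => {
  if (!last) return;
  const wav = AudioCapture.writeWav(Buffer.concat(chunks), {
    sampleRate: last.sampleRate,
    channels: last.channels,
    format: last.format
  });
  fs.writeFileSync('output.wav', wav);
});
[S7] AudioCapture.cleanupAll(): number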
Dispose all active AudioCapture instances. Returns the number of instances cleaned up.
// Create multiple instances
const capture1 = new AudioCapture();
const capture2 = new AudioCapture();
console.log(AudioCapture.getActiveInstanceCount()); // 2
// Clean up all at once
const cleaned = AudioCapture.cleanupAll();
console.log(`Cleaned up ${cleaned} instances`); // 2
[S8] AudioCapture.getActiveInstanceCount(): number
Get the number of active (non-disposed) AudioCapture instances.
const capture = new AudioCapture();
console.log(AudioCapture.getActiveInstanceCount()); // 1
capture.dispose();
console.log(AudioCapture.getActiveInstanceCount()); // 0
Error Handling
Class: AudioCaptureError
Custom error class thrown by the SDK.
- message: Human-readable error message
- code: Machine-readable error code (see below)
- details: Additional context (e.g., processId, availableApps)
Error Codes
Import ErrorCode for reliable error checking:
import { AudioCapture, AudioCaptureError, ErrorCode } from 'screencapturekit-audio-capture';
const capture = new AudioCapture();
capture.on('error', (err: AudioCaptureError) => {
  if (err.code === ErrorCode.APP_NOT_FOUND) {
    // Handle missing app
  }
});

| Code | Description |
|------|-------------|
| ERR_PERMISSION_DENIED | Screen Recording permission not granted |
| ERR_APP_NOT_FOUND | Application not found by name or bundle ID |
| ERR_PROCESS_NOT_FOUND | Process ID not found or not running |
| ERR_ALREADY_CAPTURING | Attempted to start capture while already capturing |
| ERR_CAPTURE_FAILED | Native capture failed to start (e.g., app has no windows) |
| ERR_INVALID_ARGUMENT | Invalid arguments provided to method |
Using Error Codes:
import { AudioCapture, AudioCaptureError, ErrorCode } from 'screencapturekit-audio-capture';
const capture = new AudioCapture();
capture.on('error', (err: AudioCaptureError) => {
  switch (err.code) {
    case ErrorCode.PERMISSION_DENIED:
      console.log('Grant Screen Recording permission');
      break;
    case ErrorCode.APP_NOT_FOUND:
      console.log('App not found:', err.details.requestedApp);
      console.log('Available:', err.details.availableApps);
      break;
    case ErrorCode.ALREADY_CAPTURING:
      console.log('Stop current capture first');
      capture.stopCapture();
      break;
    default:
      console.error('Error:', err.message);
  }
});
Stream Classes
AudioStream - Readable stream extending Node.js Readable:
- stop() - Stop stream and capture
- getCurrentCapture() - Get current capture info
STTConverter - Transform stream extending Node.js Transform:
- stop() - Stop stream and capture
- app - The selected ApplicationInfo
- captureOptions - Options used for capture
Low-Level API: ScreenCaptureKit
For advanced users who need direct access to the native binding:
import { ScreenCaptureKit } from 'screencapturekit-audio-capture';
const captureKit = new ScreenCaptureKit();
// Get apps (returns basic ApplicationInfo array)
const apps = captureKit.getAvailableApps();
// Start capture (requires manual callback handling)
captureKit.startCapture(processId, config, (sample) => {
  // sample: { data, sampleRate, channelCount, timestamp }
  // No enhancement - raw native data
});
captureKit.stopCapture();
const isCapturing = captureKit.isCapturing();
When to use:
- Absolute minimal overhead needed
- Building your own wrapper
- Avoiding event emitter overhead
Most users should use AudioCapture instead.
Multi-Process Capture Service
macOS ScreenCaptureKit only allows one process to capture audio at a time. If you need multiple processes to receive the same audio data, use the server/client architecture.
📁 See
readme_examples/advanced/20-capture-service.tsfor a complete example
When to Use
| Scenario | Solution |
|----------|----------|
| Single app capturing audio | Use AudioCapture directly |
| Multiple processes need same audio | Use AudioCaptureServer + AudioCaptureClient |
| Electron main + renderer processes | Use server/client |
| Microservices architecture | Use server/client |
Architecture Overview
┌─────────────────────────────────────────────────────────┐
│ AudioCaptureServer │
│ - Runs in one process │
│ - Handles actual ScreenCaptureKit capture │
│ - Broadcasts audio to all connected clients │
└─────────────────────────────────────────────────────────┘
│ WebSocket (ws://localhost:9123)
▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Client 1 │ │ Client 2 │ │ Client N │
│ (Process A)│ │ (Process B)│ │ (Process N)│
└─────────────┘ └─────────────┘ └─────────────┘
Server Usage
import { AudioCaptureServer } from 'screencapturekit-audio-capture';
const server = new AudioCaptureServer({
port: 9123, // Default: 9123
host: 'localhost' // Default: 'localhost'
});
// Start the server
await server.start();
// Server events
server.on('clientConnected', (clientId) => console.log(`Client ${clientId} connected`));
server.on('clientDisconnected', (clientId) => console.log(`Client ${clientId} disconnected`));
server.on('captureStarted', (session) => console.log(`Capture started: ${session.id}`));
server.on('captureStopped', () => console.log('Capture stopped'));
server.on('captureError', (error) => console.error('Capture error:', error));
// Stop the server
await server.stop();
Server Methods
| Method | Returns | Description |
|--------|---------|-------------|
| start() | Promise<void> | Start the WebSocket server |
| stop() | Promise<void> | Stop server and disconnect all clients |
| getSession() | CaptureSession \| null | Get current capture session info |
| getClientCount() | number | Get number of connected clients |
Server Events
| Event | Payload | Description |
|-------|---------|-------------|
| 'clientConnected' | clientId: string | Client connected |
| 'clientDisconnected' | clientId: string | Client disconnected |
| 'captureStarted' | CaptureSession | Capture session started |
| 'captureStopped' | - | Capture session stopped |
| 'captureError' | Error | Capture error occurred |
Client Usage
import { AudioCaptureClient } from 'screencapturekit-audio-capture';
const client = new AudioCaptureClient({
url: 'ws://localhost:9123', // Default
autoReconnect: true, // Default: true
reconnectDelay: 1000, // Default: 1000ms
maxReconnectAttempts: 10 // Default: 10
});
// Connect to server
await client.connect();
// Receive audio (similar API to AudioCapture)
client.on('audio', (sample) => {
console.log(`Received ${sample.data.length} samples at ${sample.sampleRate}Hz`);
});
// List available apps (via server)
const apps = await client.getApplications();
// Start capture (request sent to server)
await client.startCapture('Spotify');
// Or by PID: await client.startCapture(12345);
// Other capture methods
await client.captureWindow(windowId);
await client.captureDisplay(displayId);
await client.captureMultipleApps(['Spotify', 'Discord']);
// Get server status
const status = await client.getStatus();
console.log(`Capturing: ${status.capturing}, Clients: ${status.totalClients}`);
// Stop capture
await client.stopCapture();
// Disconnect
client.disconnect();
Client Methods
| Method | Returns | Description |
|--------|---------|-------------|
| connect() | Promise<void> | Connect to the server |
| disconnect() | void | Disconnect from server |
| getApplications() | Promise<ApplicationInfo[]> | List apps via server |
| getWindows() | Promise<WindowInfo[]> | List windows via server |
| getDisplays() | Promise<DisplayInfo[]> | List displays via server |
| startCapture(target, opts?) | Promise<boolean> | Start app capture |
| captureWindow(id, opts?) | Promise<boolean> | Start window capture |
| captureDisplay(id, opts?) | Promise<boolean> | Start display capture |
| captureMultipleApps(targets, opts?) | Promise<boolean> | Start multi-app capture |
| stopCapture() | Promise<void> | Stop current capture |
| getStatus() | Promise<ServerStatus> | Get server status |
| getClientId() | string \| null | Get this client's ID |
| getSessionId() | string \| null | Get current session ID |
Client Events
| Event | Payload | Description |
|-------|---------|-------------|
| 'connected' | - | Connected to server |
| 'disconnected' | - | Disconnected from server |
| 'reconnecting' | attempt: number | Attempting to reconnect |
| 'reconnectFailed' | - | Max reconnect attempts reached |
| 'audio' | RemoteAudioSample | Audio data received |
| 'captureStopped' | - | Server stopped capture |
| 'captureError' | { message: string } | Server capture error |
| 'error' | Error | Client-side error |
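The reconnect options together with the 'reconnecting' and 'reconnectFailed' events suggest a bounded retry loop. The sketch below models one plausible policy, assuming a fixed delay between attempts; the README does not say whether the client uses a fixed delay or backoff, and `ReconnectPolicy` and `planReconnect` are hypothetical names, not part of the package.

```typescript
// Hypothetical policy type mirroring the documented client options.
interface ReconnectPolicy {
  autoReconnect: boolean;
  reconnectDelay: number; // milliseconds between attempts
  maxReconnectAttempts: number;
}

type ReconnectStep =
  | { action: 'retry'; attempt: number; delayMs: number } // would pair with 'reconnecting'
  | { action: 'giveUp' };                                 // would pair with 'reconnectFailed'

// Decides what to do after `failedAttempts` consecutive failed reconnects.
// Assumes a fixed delay; the real client may behave differently.
function planReconnect(failedAttempts: number, policy: ReconnectPolicy): ReconnectStep {
  if (!policy.autoReconnect || failedAttempts >= policy.maxReconnectAttempts) {
    return { action: 'giveUp' };
  }
  return { action: 'retry', attempt: failedAttempts + 1, delayMs: policy.reconnectDelay };
}
```

With the documented defaults (1000 ms delay, 10 attempts), this policy would stop retrying roughly ten seconds after the connection drops, ignoring the time each attempt itself takes.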
RemoteAudioSample
Audio samples received by clients have this structure:
| Property | Type | Description |
|----------|------|-------------|
| data | Float32Array | Audio sample data |
| sampleRate | number | Sample rate in Hz |
| channels | number | Number of channels |
| timestamp | number | Timestamp in seconds |
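The playback duration of one received sample follows from the length of `data`, the channel count, and the sample rate. A minimal sketch, assuming `data` contains every channel's samples (so its length is frames × channels, which the table does not state explicitly); `RemoteAudioSampleLike` and `sampleDurationMs` are hypothetical names, not package exports.

```typescript
// Local copy of the documented fields so the sketch is self-contained.
interface RemoteAudioSampleLike {
  data: Float32Array;
  sampleRate: number;
  channels: number;
  timestamp: number;
}

// Frames = samples per channel; duration in ms = frames * 1000 / sampleRate.
function sampleDurationMs(sample: RemoteAudioSampleLike): number {
  const frames = sample.data.length / sample.channels;
  return (frames * 1000) / sample.sampleRate;
}

const exampleSample: RemoteAudioSampleLike = {
  data: new Float32Array(960), // 480 stereo frames of silence
  sampleRate: 48000,
  channels: 2,
  timestamp: 0,
};
const exampleDurationMs = sampleDurationMs(exampleSample); // 10
```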
Events Reference
Event: 'start'
Emitted when capture starts.
capture.on('start', ({ processId, app }) => {
console.log(`Capturing from ${app?.applicationName}`);
});
Event: 'audio'
Emitted for each audio sample. See Audio Sample Structure for all properties.
capture.on('audio', (sample: AudioSample) => {
console.log(`${sample.durationMs}ms, RMS: ${sample.rms}`);
});
Event: 'stop'
Emitted when capture stops.
capture.on('stop', ({ processId }) => {
console.log('Capture stopped');
});
Event: 'error'
Emitted on errors.
capture.on('error', (err: AudioCaptureError) => {
console.error(`[${err.code}]:`, err.message);
});
TypeScript
Full type definitions included. See Module Exports for import syntax.
Available Types
| Type | Description |
|------|-------------|
| AudioSample | Audio sample with data and metadata |
| ApplicationInfo | App info (processId, bundleIdentifier, applicationName) |
| WindowInfo | Window info (windowId, title, frame, etc.) |
| DisplayInfo | Display info (displayId, width, height, etc.) |
| CaptureInfo | Current capture target info |
| CaptureStatus | Full capture status including config |
| PermissionStatus | Permission verification result |
| ActivityInfo | Activity tracking stats |
| CaptureOptions | Options for startCapture() |
| AudioStreamOptions | Options for createAudioStream() |
| STTStreamOptions | Options for createSTTStream() |
| MultiAppCaptureOptions | Options for captureMultipleApps() |
| MultiWindowCaptureOptions | Options for captureMultipleWindows() |
| MultiDisplayCaptureOptions | Options for captureMultipleDisplays() |
| ServerOptions | Options for AudioCaptureServer |
| ClientOptions | Options for AudioCaptureClient |
| RemoteAudioSample | Audio sample received via client |
| CleanupResult | Result of cleanupAll() operation |
| ErrorCode | Enum of error codes |
Working with Audio Data
Buffer Format
Audio samples are Node.js Buffer objects containing Float32 PCM by default:
capture.on('audio', (sample) => {
// Use helper (recommended)
const float32 = AudioCapture.bufferToFloat32Array(sample.data);
// Or manual
const float32Manual = new Float32Array(
sample.data.buffer,
sample.data.byteOffset,
sample.data.byteLength / 4
);
});
Int16 Format
capture.startCapture('Spotify', { format: 'int16' });
capture.on('audio', (sample) => {
const int16 = new Int16Array(
sample.data.buffer,
sample.data.byteOffset,
sample.data.byteLength / 2
);
});
Filtering Silence
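Silence filtering rests on comparing a block's RMS level against a threshold. For reference, such a check can be sketched as follows; `rms`, `toDb`, and `isAboveThreshold` are hypothetical helpers written for illustration, not the package's internals, and they assume Float32 samples in the range -1.0 to 1.0.

```typescript
// Root-mean-square level of a block of Float32 samples (0 for an empty block).
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Level in dB relative to full scale (1.0); -Infinity for pure silence.
function toDb(level: number): number {
  return level > 0 ? (20 * Math.log(level)) / Math.LN10 : -Infinity;
}

// True when the block is louder than the given RMS threshold.
function isAboveThreshold(samples: Float32Array, minVolume: number): boolean {
  return rms(samples) > minVolume;
}
```

A threshold of 0.01 RMS corresponds to about -40 dB relative to full scale, which is why quiet passages as well as true silence can be dropped at that setting.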
capture.startCapture('Spotify', { minVolume: 0.01 });
// Only emits audio events when volume > 0.01 RMS
Resource Lifecycle
📁 See readme_examples/advanced/21-graceful-cleanup.ts for a complete example
Properly managing resources ensures your application shuts down cleanly without orphaned captures or memory leaks.
Instance Cleanup
const capture = new AudioCapture();
capture.startCapture('Spotify');
// When done with this specific instance
capture.dispose(); // Stops capture and releases resources
Global Cleanup
import { cleanupAll, getActiveInstanceCount, installGracefulShutdown } from 'screencapturekit-audio-capture';
// Check active instances
console.log(`Active: ${getActiveInstanceCount()}`);
// Clean up all instances at once
const result = await cleanupAll(); // Returns CleanupResult
console.log(`Cleaned up ${result.total} instances`);
// Install automatic cleanup on process exit (SIGINT, SIGTERM, etc.)
installGracefulShutdown();
Best Practices
| Pattern | When to Use |
|---------|-------------|
| capture.dispose() | Cleaning up a specific instance |
| AudioCapture.cleanupAll() | Cleaning up all AudioCapture instances |
| cleanupAll() | Cleaning up all instances (AudioCapture + AudioCaptureServer) |
| installGracefulShutdown() | Auto-cleanup on Ctrl+C, kill signals, or uncaught exceptions |
Process Exit Handling
Exit handlers are automatically installed when you create an AudioCapture or AudioCaptureServer instance. For explicit control:
import { installGracefulShutdown } from 'screencapturekit-audio-capture';
// Install once at application startup
installGracefulShutdown();
// Now SIGINT/SIGTERM will automatically:
// 1. Stop all active captures
// 2. Dispose all instances
// 3. Exit cleanly
Common Issues
No applications available
Solution: Grant Screen Recording permission in System Settings → Privacy & Security → Screen Recording, then restart your terminal.
Application not found
Solutions:
- Check if the app is running
- Use capture.getApplications() to list available apps
- Use bundle ID instead of name: capture.startCapture('com.spotify.client')
No audio samples received
Solutions:
- Ensure the app is playing audio
- Check if audio is muted
- Remove minVolume threshold for testing
- Verify the app has visible windows
Build errors
Note: Most users won't see build errors since prebuilt binaries are included. These steps apply only if compilation is needed.
Solutions:
- Install Xcode CLI Tools: xcode-select --install
- Verify macOS 13.0+: sw_vers
- Clean rebuild: npm run clean && npm run build
Examples
📁 All examples are in readme_examples/
Basics
| Example | File | Description |
|---------|------|-------------|
| Quick Start | basics/01-quick-start.ts | Basic capture setup |
| Robust Capture | basics/05-robust-capture.ts | Production error handling |
| Find Apps | basics/11-find-apps.ts | App discovery |
Voice & STT
| Example | File | Description |
|---------|------|-------------|
| STT Integration | voice/02-stt-integration.ts | Speech-to-text patterns |
| Voice Agent | voice/03-voice-agent.ts | Real-time voice processing |
| Recording | voice/04-audio-recording.ts | Record and save as WAV |
Streams
| Example | File | Description |
|---------|------|-------------|
| Stream Basics | streams/06-stream-basics.ts | Stream API fundamentals |
| Stream Processing | streams/07-stream-processing.ts | Transform streams |
Processing
| Example | File | Description |
|---------|------|-------------|
| Visualizer | processing/08-visualizer.ts | ASCII volume display |
| Volume Monitor | processing/09-volume-monitor.ts | Level alerts |
| Int16 Capture | processing/10-int16-capture.ts | Int16 format |
| Manual Processing | processing/12-manual-processing.ts | Buffer manipulation |
Capture Targets
| Example | File | Description |
|---------|------|-------------|
| Multi-App Capture | capture-targets/13-multi-app-capture.ts | Multiple apps |
| Per-App Streams | capture-targets/14-per-app-streams.ts | Separate streams |
| Window Capture | capture-targets/15-window-capture.ts | Single window |
| Display Capture | capture-targets/16-display-capture.ts | Full display |
| Multi-Window | capture-targets/17-multi-window-capture.ts | Multiple windows |
| Multi-Display | capture-targets/18-multi-display-capture.ts | Multiple displays |
Advanced
| Example | File | Description |
|---------|------|-------------|
| Advanced Methods | advanced/19-advanced-methods.ts | Activity tracking |
| Capture Service | advanced/20-capture-service.ts | Multi-process sharing |
| Graceful Cleanup | advanced/21-graceful-cleanup.ts | Resource lifecycle management |
Run examples:
npx tsx readme_examples/basics/01-quick-start.ts
npm run test:readme # Run all examples
Targeting specific apps/windows/displays:
Most examples support environment variables to target specific sources instead of using defaults:
| Env Variable | Type | Used By | Example |
|-------------|------|---------|---------|
| TARGET_APP | App name | 01-12, 19-21 | TARGET_APP="Spotify" npx tsx readme_examples/basics/01-quick-start.ts |
| TARGET_APPS | Comma-separated | 13, 14 | TARGET_APPS="Safari,Music" npx tsx readme_examples/capture-targets/13-multi-app-capture.ts |
| TARGET_WINDOW | Window ID | 15, 17 | TARGET_WINDOW=12345 npx tsx readme_examples/capture-targets/15-window-capture.ts |
| TARGET_DISPLAY | Display ID | 16, 18 | TARGET_DISPLAY=1 npx tsx readme_examples/capture-targets/16-display-capture.ts |
| VERIFY | 1 or true | 13 | VERIFY=1 npx tsx readme_examples/capture-targets/13-multi-app-capture.ts |
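How the example scripts read these variables is not shown here, but a comma-separated value such as TARGET_APPS could be parsed along these lines. `parseTargetApps` is a hypothetical helper, and the environment is passed in as a plain object so the sketch does not depend on Node typings.

```typescript
// Splits a comma-separated TARGET_APPS value into trimmed, non-empty app names.
function parseTargetApps(env: { [key: string]: string | undefined }): string[] {
  const raw = env['TARGET_APPS'];
  if (raw === undefined) return [];
  return raw
    .split(',')
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}
```

In a script this would be called as `parseTargetApps(process.env)`; an unset variable yields an empty list, leaving the script free to fall back to its default targets.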
Tip: Run npx tsx readme_examples/basics/11-find-apps.ts to list available apps and their names. Window/display IDs are printed when running the respective capture examples.
Important: Environment variables must be placed before the command, not after. TARGET_APP="Spotify" npx tsx ... works, but npx tsx ... TARGET_APP="Spotify" does not.
Platform Support
| macOS Version | Support | Notes |
|---------------|---------|-------|
| macOS 15+ (Sequoia) | ⚠️ Known issues | Single-process audio capture limitation (use server/client) |
| macOS 14+ (Sonoma) | ✅ Full | Recommended |
| macOS 13+ (Ventura) | ✅ Full | Minimum required |
| macOS 12.x and below | ❌ No | ScreenCaptureKit not available |
| Windows/Linux | ❌ No | macOS-only framework |
Note: On macOS 15+, only one process can capture audio at a time via ScreenCaptureKit. If you need multiple processes to receive audio, use the Multi-Process Capture Service.
Performance
Typical (Apple Silicon M1):
- CPU: <1% for stereo Float32
- Memory: ~10-20MB
- Latency: ~160ms (configurable)
Optimization tips:
- Use minVolume to filter silence
- Use format: 'int16' for 50% memory reduction
- Use channels: 1 for another 50% reduction
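The two 50% figures follow from the raw PCM data rate, which is sample rate × channels × bytes per sample. A small worked example, assuming a 48 kHz sample rate purely for illustration (the actual rate depends on the capture configuration); `PcmFormat`, `bytesPerSample`, and `pcmBytesPerSecond` are hypothetical names.

```typescript
type PcmFormat = 'float32' | 'int16';

// Bytes per sample for each format: Float32 is 4 bytes, Int16 is 2.
function bytesPerSample(format: PcmFormat): number {
  return format === 'float32' ? 4 : 2;
}

// Raw PCM data rate in bytes per second.
function pcmBytesPerSecond(sampleRate: number, channels: number, format: PcmFormat): number {
  return sampleRate * channels * bytesPerSample(format);
}

const stereoFloat32 = pcmBytesPerSecond(48000, 2, 'float32'); // 384000
const stereoInt16 = pcmBytesPerSecond(48000, 2, 'int16');     // 192000, half of the above
const monoInt16 = pcmBytesPerSecond(48000, 1, 'int16');       // 96000, half again
```

Combining both options therefore cuts the audio payload to a quarter of the stereo Float32 default.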
Contributing
git clone https://github.com/mrlionware/screencapturekit-audio-capture.git
cd screencapturekit-audio-capture
npm install
npm run build
npm test
License
MIT License - see LICENSE
Made with ❤️ for the Node.js and macOS developer community
