plosive

v1.0.2

Published

a month ago

A browser-based voice recording app for reading scripts aloud, one cue card at a time.

catpea

Plosive

Paste a script, poem, or narration. Plosive splits it into cue cards on blank lines. Step through each card, record your voice, trim silence, preview playback, and export the final audio.

Getting Started

npm run dev

Opens on http://localhost:8070. Paste your text, click Start, and allow microphone access.

No build step. No dependencies. Just Node.js.

How It Works

Paste text -- blank lines (\n\n) become card boundaries, --- becomes a section break
Start -- creates a project, opens the microphone once (hot mic, no re-init)
Record -- each card holds one or more audio segments
Trim -- auto-detect silence or manually select regions to trim
Preview -- click-and-hold the time ruler to scrub, or press P/Space to play
Export -- Build merges all segments, applies trims, and encodes to a downloadable file
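
The splitting rule in the first step can be sketched roughly as follows. This is an illustration only, not the parser plugin's actual code: `parseScript` and the `breakAfter` field are hypothetical names, and the sketch assumes a `---` marker sits on its own line with blank lines around it.

```javascript
// Hypothetical helper: split pasted text into cue cards.
// Assumes a block containing only "---" marks a section break
// after the preceding card.
function parseScript(text) {
  const cards = [];
  // Normalize Windows line endings, then split on one or more blank lines.
  const blocks = text.replace(/\r\n/g, "\n").split(/\n\s*\n/);
  for (const block of blocks) {
    const trimmed = block.trim();
    if (trimmed === "") continue;
    if (trimmed === "---") {
      // Flag the previous card instead of creating an empty card.
      if (cards.length > 0) cards[cards.length - 1].breakAfter = true;
      continue;
    }
    cards.push({ text: trimmed, breakAfter: false });
  }
  return cards;
}

const parsed = parseScript("Line one\n\nLine two\n\n---\n\nLine three");
console.log(parsed.length); // 3 cards; the second one has breakAfter set
```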

Keyboard Shortcuts

| Key | Action |
|-----|--------|
| R | Record / Stop recording |
| P | Play current card |
| Space | Play selection, or full card |
| C | Clear all segments on current card |
| E | Edit card text |
| I | Attach image to card |
| D | Delete attached image |
| X | Create trim from selection / Remove selected trim |
| A | Auto-trim silence |
| Q | Toggle section break after card |
| L | Toggle fluid / fixed layout |
| , | Open config |
| Esc | Cancel editing / Close config |
| Arrows | Navigate between cards |

The shortcut bar at the bottom of the screen updates with context-sensitive actions.

Features

Spectrogram

Every recorded card gets a real-time spectrogram computed via Radix-2 FFT with Hann windowing. The spectrogram is scrollable, zoomable (auto-zooms when audio exceeds canvas width), and supports:

Time ruler with tick marks and second labels, click-and-hold to preview
Trim markers with draggable handles on both edges
Selection for targeted playback or trimming
Play position indicator with center-biased auto-scrolling
Scrollbar with thumb drag and track click
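
For readers unfamiliar with the technique, here is a minimal sketch of how one spectrogram column could be computed with a Hann window and an iterative radix-2 FFT. It is not taken from components/spectrogram.js; the function names and the choice of the periodic Hann form are assumptions made for illustration.

```javascript
// Hann window of length n (periodic form, a common choice for spectrograms).
function hannWindow(n) {
  const w = new Float32Array(n);
  for (let i = 0; i < n; i++) {
    w[i] = 0.5 * (1 - Math.cos((2 * Math.PI * i) / n));
  }
  return w;
}

// In-place iterative radix-2 FFT. re and im must share a power-of-two length.
function fftRadix2(re, im) {
  const n = re.length;
  if (n === 0 || (n & (n - 1)) !== 0) {
    throw new Error("length must be a power of two");
  }
  // Bit-reversal permutation.
  for (let i = 1, j = 0; i < n; i++) {
    let bit = n >> 1;
    for (; j & bit; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) {
      [re[i], re[j]] = [re[j], re[i]];
      [im[i], im[j]] = [im[j], im[i]];
    }
  }
  // Butterfly stages, doubling the sub-transform length each pass.
  for (let len = 2; len <= n; len <<= 1) {
    const ang = (-2 * Math.PI) / len;
    for (let start = 0; start < n; start += len) {
      for (let k = 0; k < len / 2; k++) {
        const wr = Math.cos(ang * k);
        const wi = Math.sin(ang * k);
        const a = start + k;
        const b = a + len / 2;
        const tr = re[b] * wr - im[b] * wi;
        const ti = re[b] * wi + im[b] * wr;
        re[b] = re[a] - tr;
        im[b] = im[a] - ti;
        re[a] += tr;
        im[a] += ti;
      }
    }
  }
}

// One spectrogram column: magnitudes of bins 0..n/2-1 for a windowed frame.
function spectrumColumn(frame, win) {
  const n = frame.length;
  const re = new Float64Array(n);
  const im = new Float64Array(n);
  for (let i = 0; i < n; i++) re[i] = frame[i] * win[i];
  fftRadix2(re, im);
  const mags = new Float64Array(n / 2);
  for (let k = 0; k < n / 2; k++) mags[k] = Math.hypot(re[k], im[k]);
  return mags;
}
```

A full spectrogram would slide this over the recording with overlapping frames and map each magnitude to a pixel color; the hop size and color mapping are left out here.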

Non-destructive Trimming

Trim markers mask regions of audio without modifying the original recording. Trimmed regions are completely skipped during playback -- no silent gaps. Press X on a canvas selection to create a trim, or click a trim marker and press X to remove it. Press A to auto-detect and trim silence.
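
One way to picture non-destructive trimming: the recording stays intact, and playback only visits the regions left over after subtracting the trim markers. The sketch below assumes markers are `{ start, end }` pairs in seconds; `keptRegions` is a hypothetical helper, not a function from the codebase.

```javascript
// Hypothetical helper: given a recording duration and trim markers
// ({ start, end } in seconds), return the regions that should be played.
function keptRegions(duration, trims) {
  const sorted = trims
    // Clamp markers to the recording and drop empty ones.
    .map((t) => ({ start: Math.max(0, t.start), end: Math.min(duration, t.end) }))
    .filter((t) => t.end > t.start)
    .sort((a, b) => a.start - b.start);

  const kept = [];
  let cursor = 0; // end of the last trimmed region seen so far
  for (const t of sorted) {
    if (t.start > cursor) kept.push({ start: cursor, end: t.start });
    cursor = Math.max(cursor, t.end); // also handles overlapping markers
  }
  if (cursor < duration) kept.push({ start: cursor, end: duration });
  return kept;
}

// Overlapping markers at 2-4 s and 3-5 s collapse into one skipped region.
console.log(keptRegions(10, [{ start: 2, end: 4 }, { start: 3, end: 5 }]));
```

Playing the kept regions back to back is what removes the gaps: nothing is scheduled for the trimmed spans at all.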

Cue Breaks

Press Q to toggle a section break (dashed line) after the current card. Breaks insert 500ms of silence in the final export.
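
As a rough illustration of what a break means at export time, the sketch below joins per-card sample buffers and leaves a zero-filled gap after each card that has a break. `joinWithBreaks`, the `samples` and `breakAfter` fields, and the use of raw `Float32Array` buffers are all assumptions; the actual Build step may work differently.

```javascript
// Hypothetical export helper: concatenate per-card sample buffers and
// insert silence after every card flagged with a section break.
function joinWithBreaks(cards, sampleRate, breakMs = 500) {
  const gap = Math.round((sampleRate * breakMs) / 1000); // 24000 samples at 48 kHz
  let total = 0;
  for (const card of cards) {
    total += card.samples.length + (card.breakAfter ? gap : 0);
  }

  const out = new Float32Array(total); // zero-filled, so gaps are already silent
  let offset = 0;
  for (const card of cards) {
    out.set(card.samples, offset);
    offset += card.samples.length + (card.breakAfter ? gap : 0);
  }
  return out;
}
```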

Image Attachments

Press I to attach an image to a card. Images appear as thumbnails and are used by the slideshow export to time visuals to audio.

Live Preview

During recording, a live waveform or spectrogram preview appears on the current card. Toggle between modes in the config panel.

Project Persistence

When running with the server, projects are saved automatically:

Audio segments saved on recording stop
Images saved on attach
Trim markers, card state, and breaks saved on every change
Reload the page (F5) to restore full state

Project Structure

plosive/
  index.html              App shell
  style.css               All styles
  cue-cards.js            Main application (plugin architecture)
  server.js               Project management server
  plosive.config.json     Tunable parameters
  lib/
    bus.js                EventEmitter, Signal, App
  components/
    spectrogram.js        <plosive-spectrogram> web component
    preview.js            <plosive-preview> live recording visualization
    shortcut-bar.js       <plosive-shortcuts> context-sensitive shortcut bar
    config-modal.js       <plosive-config> settings modal with focus trap
  samples/
    poem-example.md       Example script
  projects/               Server-managed project storage

Architecture

The app uses a plugin architecture built on a minimal EventEmitter:

var app = new App();
app.use(parserPlugin);
app.use(navigationPlugin);
app.use(keyboardPlugin);
app.use(micPlugin);
app.use(recorderPlugin);
app.use(playerPlugin);
// ...

Each plugin receives the app instance and wires up event listeners, keyboard shortcuts, and state mutations. Plugins communicate through the event bus -- no direct coupling.
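
To make the pattern concrete, here is a minimal sketch of an event bus with a `use` method and two plugins that only interact through events. It is not the contents of lib/bus.js: the `on`/`emit` method names, the event names, and both plugins are hypothetical.

```javascript
// Minimal sketch; the real lib/bus.js may differ.
class EventEmitter {
  constructor() {
    this.listeners = new Map();
  }
  on(event, fn) {
    if (!this.listeners.has(event)) this.listeners.set(event, []);
    this.listeners.get(event).push(fn);
    return this;
  }
  emit(event, ...args) {
    for (const fn of this.listeners.get(event) || []) fn(...args);
    return this;
  }
}

class App extends EventEmitter {
  // A plugin is assumed to be a function that receives the app instance.
  use(plugin) {
    plugin(this);
    return this;
  }
}

// Two hypothetical plugins. Neither references the other directly.
function demoKeyboardPlugin(app) {
  app.on("key:R", () => app.emit("record:toggle"));
}
function demoLoggerPlugin(app) {
  app.log = [];
  app.on("record:toggle", () => app.log.push("record toggled"));
}

const demo = new App();
demo.use(demoKeyboardPlugin);
demo.use(demoLoggerPlugin);
demo.emit("key:R");
console.log(demo.log); // one entry: "record toggled"
```

Because the plugins share only event names, either one can be removed or replaced without touching the other.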

Web components (<plosive-spectrogram>, <plosive-shortcuts>, <plosive-preview>, <plosive-config>) encapsulate UI that needs its own lifecycle.

Server Routes

| Method | Path | Description |
|--------|------|-------------|
| GET | / | Redirect to /new |
| GET | /new | Empty project page |
| GET | /projects | Project list (HTML) |
| GET | /project/:id | Load project |
| GET | /project/:id.json | Full project state (JSON) |
| POST | /api/projects | Create project |
| PUT | /api/project/:id/state | Save state |
| PUT/GET | /api/project/:id/audio/:file | Audio segments |
| PUT/GET | /api/project/:id/image/:file | Image attachments |
| GET | /project/:id/filelist.txt | FFmpeg concat demuxer file list |
| GET | /project/:id/slideshow.sh | Generated slideshow build script |
| GET | /health | Health check |

FFmpeg Integration

Concatenation

filelist.txt is served in ffmpeg concat demuxer format:

ffmpeg -f concat -safe 0 -i filelist.txt -c copy combined.webm
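
The concat demuxer reads a plain text list with one `file '...'` line per input. A sketch of how such a list could be produced is below; `buildFileList` and the segment file names are hypothetical, and the server's real naming scheme is not documented here.

```javascript
// Build the text of an ffmpeg concat demuxer list from segment file names.
function buildFileList(files) {
  return (
    files
      // A single quote inside a quoted path is written as '\''
      // (close the quote, escaped quote, reopen the quote).
      .map((name) => "file '" + name.replace(/'/g, "'\\''") + "'")
      .join("\n") + "\n"
  );
}

// Hypothetical segment names, in playback order.
console.log(buildFileList(["card-001.webm", "card-002.webm"]));
```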

Slideshow

slideshow.sh is a generated bash script that:

Concatenates all audio segments
Probes per-card durations
Overlays attached images timed to their cards
Outputs an H.264/AAC MP4 with loudness normalization

cd projects/<uuid>/
bash slideshow.sh output.mp4
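
The timing step can be thought of as a running sum over per-card durations. The sketch below is an assumption about the general idea, not the generated script's logic: `imageTimeline` and the card fields are hypothetical, and it assumes an image is shown only while its own card plays.

```javascript
// Hypothetical helper: derive display windows for attached images from
// per-card durations (in seconds), counting section breaks as extra time.
function imageTimeline(cards, breakSec = 0.5) {
  const timeline = [];
  let t = 0; // running position in the combined audio
  for (const card of cards) {
    if (card.image) {
      timeline.push({ image: card.image, startSec: t, endSec: t + card.durationSec });
    }
    t += card.durationSec + (card.breakAfter ? breakSec : 0);
  }
  return timeline;
}

console.log(
  imageTimeline([
    { durationSec: 2, image: "a.png", breakAfter: true },
    { durationSec: 1.5, image: null, breakAfter: false },
    { durationSec: 3, image: "b.png", breakAfter: false },
  ])
);
```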

Configuration

plosive.config.json provides tunable defaults:

{
  "autoTrim": {
    "threshold": 0.01,
    "maxSilenceMs": 500,
    "paddingMs": 250
  },
  "ruler": {
    "tickIntervalSec": 0.5
  }
}

threshold -- RMS level below which audio is considered silence
maxSilenceMs -- longest silence left untrimmed; a silent stretch at least this long gets a trim marker
paddingMs -- padding kept on each side of a silence region
tickIntervalSec -- time ruler tick spacing
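
A minimal sketch of how the three autoTrim parameters could interact is shown below. It is an illustration under stated assumptions, not the app's actual auto-trim code: `findSilenceTrims` and the 10 ms analysis frame (`frameMs`) are hypothetical, and padding is applied on both sides of every silent run.

```javascript
// Sketch of RMS-based silence detection using the documented config keys.
// frameMs is a hypothetical analysis frame size, not a Plosive setting.
function findSilenceTrims(samples, sampleRate, cfg, frameMs = 10) {
  const frame = Math.max(1, Math.round((sampleRate * frameMs) / 1000));
  const trims = [];
  let runStart = -1; // sample index where the current silent run began

  const closeRun = (runEnd) => {
    const startMs = (runStart / sampleRate) * 1000;
    const endMs = (runEnd / sampleRate) * 1000;
    if (endMs - startMs >= cfg.maxSilenceMs) {
      // Keep paddingMs of the silence on each side of the trim.
      const s = startMs + cfg.paddingMs;
      const e = endMs - cfg.paddingMs;
      if (e > s) trims.push({ startMs: s, endMs: e });
    }
    runStart = -1;
  };

  for (let i = 0; i < samples.length; i += frame) {
    const end = Math.min(i + frame, samples.length);
    let sum = 0;
    for (let j = i; j < end; j++) sum += samples[j] * samples[j];
    const rms = Math.sqrt(sum / (end - i));
    if (rms < cfg.threshold) {
      if (runStart < 0) runStart = i;
    } else if (runStart >= 0) {
      closeRun(i);
    }
  }
  if (runStart >= 0) closeRun(samples.length);
  return trims;
}
```

With the defaults above, a one-second pause would yield a 500 ms trim in its middle, leaving 250 ms of natural silence before and after it.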

License

MIT