@caitdesk/caption-processor
v0.1.0
Standalone caption processing library — transforms and text accumulation for real-time caption streams. Zero external dependencies.
Install
bun add @caitdesk/caption-processor
Quick Start
import { CaptionProcessor } from '@caitdesk/caption-processor';
const processor = new CaptionProcessor({
eventTransforms: {
'profanity-filter': { enabled: true, data: { words: ['badword'] } },
},
});
const output = processor.process({ type: 'append', text: 'Hello world', seq: 1, ts: Date.now() });
// { text: 'Hello world', seq: 1, ts: 1708000000000, lineTs: [1708000000000] }
processor.process({ type: 'append', text: '\nLine two', seq: 2, ts: Date.now() });
// { text: 'Hello world\nLine two', seq: 2, ts: ..., lineTs: [1708000000000, 1708000000001] }
processor.process({ type: 'delete', count: 3, seq: 3, ts: Date.now() });
// { text: 'Hello world\nLine ', seq: 3, ts: ..., lineTs: [...] } (only the 3 characters of 'two' are removed; the space stays)
API
High-Level: CaptionProcessor
Combines the transform pipeline with text accumulation into a single class. Best for external consumers.
import { CaptionProcessor } from '@caitdesk/caption-processor';
const processor = new CaptionProcessor({
orgTransforms: { ... }, // Organization-level transform settings
eventTransforms: { ... }, // Event-level overrides
captioner: { // Per-captioner config
id: 'captioner-123',
transforms: { 'text-replacer': { enabled: true, data: { pairs: [{ from: 'brb', to: 'be right back' }] } } },
},
});
| Method | Description |
|--------|-------------|
| process(msg: CaptionInput): CaptionOutput \| null | Run transforms + accumulate text. Returns null if filtered. |
| updateConfig(config) | Update transform config mid-stream. |
| getState(): CaptionOutput | Get current accumulated state. |
| getText(): string | Get current full text. |
| reset() | Clear all accumulated state. |
Low-Level: Individual Exports
For advanced use cases where you need direct control over transforms and text state separately.
TextState
Stateful text accumulator. Processes append/delete/sync messages and tracks per-line creation timestamps.
import { TextState } from '@caitdesk/caption-processor';
const state = new TextState();
state.process({ type: 'append', text: 'Hello', seq: 1, ts: 1000 });
// { text: 'Hello', seq: 1, ts: 1000, lineTs: [1000] }
state.process({ type: 'append', text: '\b\blo', seq: 2, ts: 2000 });
// { text: 'Hello', seq: 2, ts: 2000, lineTs: [1000] }
// (\b deletes 'lo', then 'lo' is re-appended)
| Method | Description |
|--------|-------------|
| process(msg: CaptionInput): CaptionOutput | Process a message and return current state. |
| getFullText(): string | Current accumulated text. |
| getLastSeq(): number | Last processed sequence number. |
| getLineTs(): number[] | Per-line creation timestamps (copy). |
| restoreLineTs(lineTs: number[]) | Restore timestamps from persistence (validates length). |
| reset() | Clear all state. |
Message types:
- append: Append text. Supports \b (backspace) characters for corrections.
- delete: Delete count characters from end.
- sync: Replace entire text (full state reset).
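To make the three message types concrete, here is a minimal self-contained sketch of an accumulator that handles them. It is not the library's TextState: the message field names, the SketchTextState class, and the rule for assigning line timestamps (one per line, reconciled once per message) are assumptions for illustration.

```typescript
// Hypothetical message shapes: the README names AppendInput, DeleteInput and
// SyncInput but does not list their fields, so these are assumptions.
type SketchInput =
  | { type: 'append'; text: string; seq: number; ts: number }
  | { type: 'delete'; count: number; seq: number; ts: number }
  | { type: 'sync'; text: string; seq: number; ts: number };

type SketchOutput = { text: string; seq: number; ts: number; lineTs: number[] };

class SketchTextState {
  private text = '';
  private lineTs: number[] = [];

  process(msg: SketchInput): SketchOutput {
    if (msg.type === 'append') {
      for (let i = 0; i < msg.text.length; i++) {
        const ch = msg.text.charAt(i);
        // '\b' removes the last accumulated character; anything else is appended.
        this.text = ch === '\b' ? this.text.slice(0, -1) : this.text + ch;
      }
    } else if (msg.type === 'delete') {
      // Clamp so an oversized count simply empties the text.
      const keep = Math.max(0, this.text.length - msg.count);
      this.text = this.text.slice(0, keep);
    } else {
      // sync: replace everything and restart the line timestamps.
      this.text = msg.text;
      this.lineTs = [];
    }

    // Assumed rule: one timestamp per line, stamped with the ts of the
    // message after which the line first exists.
    const lineCount = this.text === '' ? 0 : this.text.split('\n').length;
    this.lineTs.length = Math.min(this.lineTs.length, lineCount);
    while (this.lineTs.length < lineCount) this.lineTs.push(msg.ts);

    return { text: this.text, seq: msg.seq, ts: msg.ts, lineTs: this.lineTs.slice() };
  }
}

const sketch = new SketchTextState();
sketch.process({ type: 'append', text: 'Hello', seq: 1, ts: 1000 });
console.log(sketch.process({ type: 'append', text: '\b\blo', seq: 2, ts: 2000 }).text); // Hello
```

In this sketch, backspaces and delete behave the same way on the accumulated text. The difference is only in how the correction is expressed on the wire.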
processCaption
Run a message through the transform pipeline without text accumulation.
import { processCaption } from '@caitdesk/caption-processor';
import type { PipelineContext } from '@caitdesk/caption-processor';
const ctx: PipelineContext = {
eventConfig: resolvedConfig, // from resolveTransformConfig()
captioner: { id: '...', transforms: { ... } },
};
const result = processCaption(msg, ctx);
// null if filtered, transformed CaptionInput otherwise
resolveTransformConfig
Merge org/event settings with transform defaults into a resolved config map.
import { resolveTransformConfig } from '@caitdesk/caption-processor';
const config = resolveTransformConfig(orgTransforms, eventTransforms);
// Map<string, TransformConfigEntry> (only enabled transforms)
Resolution order: event override > org default > transform.defaultEnabled. Data merge is shallow: event data overrides org data per-key.
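The resolution order and the shallow merge can be sketched as follows. This is an illustration under stated assumptions, not the library's resolveTransformConfig: resolveSketch takes an explicit defaults argument in place of the transform registry, treats null the same as "not set", and the mask data key is made up for the example.

```typescript
type Entry = { enabled?: boolean | null; data?: Record<string, unknown> | null };
type Settings = Record<string, Entry>;

// Hypothetical stand-in for the transform registry: id -> defaultEnabled.
type Defaults = Record<string, boolean>;

function resolveSketch(
  defaults: Defaults,
  org: Settings = {},
  event: Settings = {},
): Map<string, Entry> {
  const resolved = new Map<string, Entry>();
  for (const id of Object.keys(defaults)) {
    const orgEntry: Entry | undefined = org[id];
    const eventEntry: Entry | undefined = event[id];
    // Assumption: null or undefined means "not set at this level, fall through".
    // Order: event override > org default > transform.defaultEnabled.
    const enabled = eventEntry?.enabled ?? orgEntry?.enabled ?? defaults[id] ?? false;
    if (!enabled) continue; // the resolved map holds only enabled transforms
    // Shallow merge: an event key replaces the org key of the same name.
    const data = { ...(orgEntry?.data ?? {}), ...(eventEntry?.data ?? {}) };
    resolved.set(id, { enabled: true, data });
  }
  return resolved;
}

const resolvedSketch = resolveSketch(
  { 'profanity-filter': false },
  { 'profanity-filter': { enabled: true, data: { words: ['a'], mask: '#' } } },
  { 'profanity-filter': { data: { words: ['b'] } } },
);
console.log(resolvedSketch.get('profanity-filter'));
// { enabled: true, data: { words: [ 'b' ], mask: '#' } }
```

Because the merge is shallow, the event's words array replaces the org's array outright rather than being concatenated with it, while the org-only mask key survives.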
Built-in Transforms
| ID | Level | Default | Description |
|----|-------|---------|-------------|
| char-replacer | system | enabled | Replaces -> with →, 1/2 with ½, ... with … |
| empty-filter | system | enabled | Filters out empty text messages |
| text-replacer | captioner | disabled | Per-captioner macro expansion (e.g., brb → be right back) |
| profanity-filter | optional | disabled | Word-list replacement with *** |
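As a rough illustration of what the char-replacer and profanity-filter rows describe, the following self-contained sketch applies the same substitutions to a plain string. The function names are hypothetical, and the matching rules for the word list (whole words, case-insensitive) are assumptions. The README does not say how the built-in filter matches.

```typescript
// Substitutions listed for char-replacer in the table above.
const CHAR_PAIRS: Array<[string, string]> = [
  ['...', '…'],
  ['->', '→'],
  ['1/2', '½'],
];

function replaceChars(text: string): string {
  let out = text;
  for (const [from, to] of CHAR_PAIRS) {
    // split/join replaces every occurrence without needing a regex.
    out = out.split(from).join(to);
  }
  return out;
}

function maskWords(text: string, words: string[]): string {
  // Escape regex metacharacters so list entries match literally.
  const escaped = words
    .filter((w) => w.length > 0)
    .map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  if (escaped.length === 0) return text;
  // Assumption: whole-word, case-insensitive matching.
  const pattern = new RegExp('\\b(?:' + escaped.join('|') + ')\\b', 'gi');
  return text.replace(pattern, '***');
}

console.log(replaceChars('Wait... go -> 1/2')); // Wait… go → ½
console.log(maskWords('A Badword, then badwords', ['badword'])); // A ***, then badwords
```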
Transform Levels
- System: Always run, no configuration needed.
- Optional: Enabled/disabled per org or event via TransformsSettings.
- Captioner: Two-gate. Org/event must enable the feature, and the captioner must have personal config with enabled: true.
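The three levels amount to a small decision rule, sketched here as a standalone function. shouldRun is a hypothetical helper, not an export of the package. It assumes the resolved config map contains only enabled transforms, as described for resolveTransformConfig.

```typescript
type Level = 'system' | 'optional' | 'captioner';
type Entry = { enabled?: boolean | null; data?: Record<string, unknown> | null };

function shouldRun(
  level: Level,
  id: string,
  resolved: Map<string, Entry>,
  captionerTransforms?: Record<string, Entry>,
): boolean {
  // System transforms always run.
  if (level === 'system') return true;
  // Gate 1: org/event enabled the feature (it is present in the resolved map).
  const featureOn = resolved.has(id);
  if (level === 'optional') return featureOn;
  // Gate 2: the captioner's personal config must explicitly say enabled: true.
  const personal = captionerTransforms ? captionerTransforms[id] : undefined;
  return featureOn && personal !== undefined && personal.enabled === true;
}

const eventConfig = new Map<string, Entry>([['text-replacer', { enabled: true }]]);
console.log(shouldRun('captioner', 'text-replacer', eventConfig)); // false
console.log(shouldRun('captioner', 'text-replacer', eventConfig, { 'text-replacer': { enabled: true } })); // true
```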
Custom Transforms
Register custom transforms using the register function. Registration order determines execution order.
import { register } from '@caitdesk/caption-processor';
import type { TransformDefinition } from '@caitdesk/caption-processor';
register({
id: 'my-transform',
name: 'My Transform',
description: 'Does something custom',
level: 'optional',
defaultEnabled: false,
transform: (msg, config, ctx) => {
if (msg.type === 'delete') return msg;
// Modify msg.text, or return null to filter
return { ...msg, text: msg.text.toUpperCase() };
},
});
Important: Register transforms before creating a CaptionProcessor or calling resolveTransformConfig, since both read from the registry at call time.
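To see why registration order matters, consider this self-contained sketch of a pipeline that runs steps in the order they were registered and stops as soon as one returns null. createPipeline, Step, and the two example steps are hypothetical, and the transform signature is simplified to take only the message (the real TransformFn also receives config and ctx).

```typescript
// Simplified message shape, assumed for illustration.
type Msg =
  | { type: 'append' | 'sync'; text: string; seq: number; ts: number }
  | { type: 'delete'; count: number; seq: number; ts: number };

type Step = { id: string; transform: (msg: Msg) => Msg | null };

function createPipeline() {
  const steps: Step[] = [];
  return {
    register(step: Step): void {
      steps.push(step); // execution order is registration order
    },
    run(msg: Msg): Msg | null {
      let current = msg;
      for (const step of steps) {
        const next = step.transform(current);
        if (next === null) return null; // filtered: later steps never run
        current = next;
      }
      return current;
    },
  };
}

const trim: Step = {
  id: 'trim',
  transform: (msg) => (msg.type === 'delete' ? msg : { ...msg, text: msg.text.trim() }),
};
const dropEmpty: Step = {
  id: 'drop-empty',
  transform: (msg) => (msg.type !== 'delete' && msg.text === '' ? null : msg),
};

const trimFirst = createPipeline();
trimFirst.register(trim);
trimFirst.register(dropEmpty);

const filterFirst = createPipeline();
filterFirst.register(dropEmpty);
filterFirst.register(trim);

const blank: Msg = { type: 'append', text: '   ', seq: 1, ts: 0 };
console.log(trimFirst.run(blank)); // null: trimmed to '' first, then filtered
console.log(filterFirst.run(blank)); // text is '': the filter ran before trimming
```

The same two steps give different results depending only on the order in which they were registered.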
Types
// Input
type CaptionInput = (AppendInput | DeleteInput | SyncInput) & { seq: number; ts: number };
// Output
type CaptionOutput = { text: string; seq: number; ts: number; lineTs: number[] };
// Config
type TransformConfigEntry = { enabled?: boolean | null; data?: Record<string, unknown> | null };
type TransformsSettings = Record<string, TransformConfigEntry>;
type CaptionerConfig = { id: string; transforms?: TransformsSettings };
// Transforms
type TransformFn = (msg: CaptionInput, config: TransformConfigEntry | undefined, ctx: PipelineContext) => CaptionInput | null;
type TransformDefinition = { id: string; name: string; description: string; level: TransformLevel; defaultEnabled: boolean; transform: TransformFn };
type PipelineContext = { eventConfig?: Map<string, TransformConfigEntry>; captioner?: CaptionerConfig };