@litomore/coli
v0.0.2
A CLI for the Cola
coli
Core library for Cola. Provides essential capabilities commonly used by agents.
CLI
npm i -g @litomore/coli
Usage
$ coli <command> [options]
Commands
asr Transcribe an audio file using speech recognition
Examples
$ coli asr recording.m4a
$ coli asr -j recording.m4a
$ coli asr --model whisper recording.wav
coli asr
Transcribe an audio file using speech recognition.
# Plain text output
coli asr recording.m4a
# JSON output
coli asr -j recording.m4a
# Select model
coli asr --model whisper recording.wav
Options
-j, --json Output result in JSON format
--model Model to use: whisper, sensevoice (default: sensevoice)
JSON output example
{
  "text": "The tribal chieftain called for the boy.",
  "model": "sensevoice-small",
  "lang": "<|en|>",
  "emotion": "<|NEUTRAL|>",
  "event": "<|Speech|>",
  "tokens": ["The", " tri", "bal", " chief", "tain", "..."],
  "timestamps": [0.9, 1.26, 1.56, 1.8, 2.16, "..."],
  "duration": 7.152
}
API
npm i @litomore/coli
ASR
ensureModels()
Check for required models and download any that are missing. Call this before runAsr.
import {ensureModels} from '@litomore/coli';
await ensureModels();
runAsr(filePath, options)
Run speech recognition on an audio file. Results are printed to stdout.
import {ensureModels, runAsr} from '@litomore/coli';
await ensureModels();
// Plain text output
await runAsr('recording.m4a', {json: false, model: 'sensevoice'});
// JSON output
await runAsr('recording.m4a', {json: true, model: 'whisper'});
Options
| Property | Type | Description |
| -------- | --------------------------- | ----------------------------------------------------------------------------- |
| json | boolean | Output JSON (with model name, tokens, timestamps, etc.) instead of plain text |
| model | 'whisper' \| 'sensevoice' | Which model to use for recognition |
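Because the JSON result is printed to stdout rather than returned, a caller would typically capture that text and parse it. The sketch below is one possible way to tidy a parsed result. It assumes the field names and the `<|...|>` tag style shown in the JSON output example. `stripTag` and `summarizeAsrResult` are hypothetical helpers, not part of @litomore/coli, and the README does not say whether each timestamp marks the start or the end of a token, so the field is simply called `time`.

```javascript
// Hypothetical helpers, not part of @litomore/coli.
// Assumes the JSON shape shown in the "JSON output example" section.

/** Strip the `<|` and `|>` wrappers from a tag such as '<|en|>'. */
function stripTag(tag) {
  const match = /^<\|(.*)\|>$/.exec(tag);
  return match ? match[1] : tag;
}

/** Parse JSON text and pair each token with the timestamp at the same index. */
function summarizeAsrResult(jsonText) {
  const result = JSON.parse(jsonText);
  const words = result.tokens.map((token, i) => ({
    token,
    time: result.timestamps[i], // seconds, as in the example output
  }));
  return {
    text: result.text,
    lang: stripTag(result.lang),
    emotion: stripTag(result.emotion),
    event: stripTag(result.event),
    duration: result.duration,
    words,
  };
}

// Sample input shortened from the README example (elided entries left out).
const sample = JSON.stringify({
  text: 'The tribal chieftain called for the boy.',
  model: 'sensevoice-small',
  lang: '<|en|>',
  emotion: '<|NEUTRAL|>',
  event: '<|Speech|>',
  tokens: ['The', ' tri', 'bal', ' chief', 'tain'],
  timestamps: [0.9, 1.26, 1.56, 1.8, 2.16],
  duration: 7.152,
});

const summary = summarizeAsrResult(sample);
console.log(summary.lang, summary.words[0]);
```

Results from other models may not carry every field shown here, so real code would likely want to guard against missing `tokens` or `timestamps`.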
getModelPath(model)
Returns the local filesystem path for a given model.
import {getModelPath} from '@litomore/coli';
getModelPath('sensevoice');
// => '/Users/you/.coli/models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17'
getModelPath('whisper');
// => '/Users/you/.coli/models/sherpa-onnx-whisper-tiny.en'
modelDisplayNames
A mapping from model key to its human-readable display name.
import {modelDisplayNames} from '@litomore/coli';
modelDisplayNames.sensevoice; // => 'sensevoice-small'
modelDisplayNames.whisper; // => 'whisper-tiny.en'
Models
On first run, coli automatically downloads ASR models to ~/.coli/models/:
| Name | Model | Languages |
| ---------------------- | ------------------------------------------------------------------ | --------------------------------------------- |
| sensevoice (default) | SenseVoice Small int8 | Chinese, English, Japanese, Korean, Cantonese |
| whisper | Whisper tiny.en int8 | English |
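The table above can guide the choice of `--model` value. As a rough illustration, the lookup below copies the language lists from the table and returns the model keys that cover a given language. `modelLanguages` and `modelsFor` are hypothetical names for this sketch and are not exported by @litomore/coli.

```javascript
// Hypothetical lookup copied from the models table; not part of @litomore/coli.
const modelLanguages = {
  sensevoice: ['Chinese', 'English', 'Japanese', 'Korean', 'Cantonese'],
  whisper: ['English'],
};

/** Return the model keys whose language list includes `language`. */
function modelsFor(language) {
  return Object.keys(modelLanguages).filter((model) =>
    modelLanguages[model].includes(language),
  );
}

console.log(modelsFor('English'));
console.log(modelsFor('Japanese'));
```

A language that neither model lists yields an empty array, which a caller could treat as a signal that transcription quality is not covered by this README.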
Supported audio formats
WAV files are passed directly to the recognizer. All other formats (m4a, mp3, ogg, flac, etc.) are automatically converted to 16 kHz mono WAV via ffmpeg.
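For readers who want to reproduce the conversion by hand, the sketch below builds an ffmpeg argument list that yields 16 kHz mono WAV. The exact command coli runs is not shown in this README, so this is an assumption based on standard ffmpeg options (`-ar` for sample rate, `-ac` for channel count). `needsConversion` and `ffmpegArgs` are hypothetical helper names.

```javascript
// Hypothetical sketch; coli's actual ffmpeg invocation is not shown in the README.

/** True when the file extension is not .wav (case-insensitive). */
function needsConversion(filePath) {
  return !filePath.toLowerCase().endsWith('.wav');
}

/** Build ffmpeg arguments for a 16 kHz mono WAV conversion. */
function ffmpegArgs(inputPath, outputPath) {
  return [
    '-i', inputPath, // input file in any format ffmpeg can decode
    '-ar', '16000', // resample to 16 kHz
    '-ac', '1', // downmix to a single channel (mono)
    outputPath, // the .wav extension selects the WAV container
  ];
}

const input = 'recording.m4a';
if (needsConversion(input)) {
  // Print the equivalent shell command instead of running it.
  console.log(['ffmpeg', ...ffmpegArgs(input, 'recording.wav')].join(' '));
}
```

The argument array can be passed to a process spawner as-is, which avoids shell quoting problems with file names that contain spaces.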
License
MIT
