@karaoke-cms/enrich
v0.18.1
AI enrichment pipeline for karaoke-cms
AI enrichment pipeline for karaoke-cms. Reads published vault content and writes AI-generated metadata (title, reading_time, tags, description, related) back into frontmatter.
Usage
CLI
# Enrich all published files in your vault
KARAOKE_VAULT=./my-vault karaoke-enrich
# Enrich only specific files (e.g. from a git pre-commit hook)
karaoke-enrich --staged path/to/post.md another/post.md
Library
import { run, defineProvider } from '@karaoke-cms/enrich'
// Use a built-in provider (reads API key from env)
const result = await run({ vault: '/absolute/path/to/vault' })
console.log(`Enriched ${result.enriched} files, wrote ${result.written}`)
// Use a custom provider — no API key required
const myProvider = defineProvider({
  async enrich(body) {
    return { tags: ['custom'], description: 'Custom description.' }
  }
})
// Named differently from `result` above so both calls can live in one module
const customResult = await run({ vault: '/path/to/vault', provider: myProvider })
Configuration
Options can be passed to run(); those with an env var listed below can also be set through the environment:
| Option | Env var | Default |
|---|---|---|
| vault | KARAOKE_VAULT | — (required) |
| provider | ENRICH_PROVIDER | 'openai' |
| maxEnrich | MAX_ENRICHMENTS_PER_RUN | 20 |
| relatedMax | RELATED_MAX | 3 |
| dryRun | DRY_RUN | false |
| cachePath | ENRICH_CACHE_PATH | {cwd}/.enrich-cache.json |
| stagedFiles | — | null (all files) |
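The table can be read as a precedence chain: explicit option, then environment variable, then default. The sketch below illustrates that reading and is not the package's actual resolution code. The names resolveOptions, toInt, and EnrichOptions are hypothetical, the "explicit beats env" ordering and the DRY_RUN parsing are assumptions, and provider is simplified to a string even though run() also accepts a provider object.

```typescript
// Hypothetical sketch of option resolution; not taken from @karaoke-cms/enrich.
interface EnrichOptions {
  vault: string
  provider: string
  maxEnrich: number
  relatedMax: number
  dryRun: boolean
  cachePath: string
}

type Env = Record<string, string | undefined>

// Parse an integer env value, falling back when it is missing or malformed.
function toInt(raw: string | undefined, fallback: number): number {
  const n = Number.parseInt(raw ?? '', 10)
  return Number.isNaN(n) ? fallback : n
}

function resolveOptions(
  explicit: Partial<EnrichOptions>,
  env: Env,
  cwd: string,
): EnrichOptions {
  const vault = explicit.vault ?? env['KARAOKE_VAULT']
  if (!vault) throw new Error('vault is required (option or KARAOKE_VAULT)')
  return {
    vault,
    provider: explicit.provider ?? env['ENRICH_PROVIDER'] ?? 'openai',
    maxEnrich: explicit.maxEnrich ?? toInt(env['MAX_ENRICHMENTS_PER_RUN'], 20),
    relatedMax: explicit.relatedMax ?? toInt(env['RELATED_MAX'], 3),
    // Assumption: 'true' or '1' turns dry-run on; anything else leaves it off.
    dryRun: explicit.dryRun ?? (env['DRY_RUN'] === 'true' || env['DRY_RUN'] === '1'),
    cachePath:
      explicit.cachePath ?? env['ENRICH_CACHE_PATH'] ?? `${cwd}/.enrich-cache.json`,
  }
}
```

Passing the environment and working directory as arguments keeps the function easy to test; a caller would typically hand in process.env and process.cwd().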
What it does
For each published Markdown file that hasn't been enriched yet (or whose body has changed):
- Title: generates a title for untitled documents, writes the title field
- Reading time: counts words, writes reading_time in minutes
- Tags: asks the AI for 3–6 tags, writes the tags array
- Description: asks the AI for a 20–30 word summary, writes the description field
- Related posts: computes tag overlap across all posts, writes related (slugs)
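Two of these steps need no AI call and are easy to picture in code. The following is a minimal sketch, not the package's implementation: readingTimeMinutes, relatedSlugs, and the Post shape are hypothetical names, and the 200 words-per-minute rate and the alphabetical tie-break are assumptions.

```typescript
// Assumption: 200 words per minute; the rate the package uses is not documented here.
function readingTimeMinutes(body: string, wordsPerMinute = 200): number {
  const words = body.trim().split(/\s+/).filter(Boolean).length
  // Round up, and never report less than one minute.
  return Math.max(1, Math.ceil(words / wordsPerMinute))
}

interface Post {
  slug: string
  tags: string[]
}

// Rank the other posts by how many tags they share with `target`,
// then keep at most `max` slugs that have any overlap at all.
function relatedSlugs(target: Post, all: Post[], max = 3): string[] {
  const targetTags = new Set(target.tags)
  return all
    .filter((p) => p.slug !== target.slug)
    .map((p) => ({
      slug: p.slug,
      overlap: p.tags.filter((t) => targetTags.has(t)).length,
    }))
    .filter((p) => p.overlap > 0)
    // Assumption: ties are broken alphabetically by slug for stable output.
    .sort((a, b) => b.overlap - a.overlap || a.slug.localeCompare(b.slug))
    .slice(0, max)
    .map((p) => p.slug)
}
```

The max parameter plays the role of the relatedMax option from the configuration table.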
Results are cached in .enrich-cache.json (gitignored), so unchanged files are skipped. A per-run cap (maxEnrich, default 20) prevents runaway API costs.
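One way to picture the cache and the cap working together is a filter over body hashes followed by a slice. This is a sketch under stated assumptions rather than the package's logic: the cache shape (path mapped to a body hash), the names selectForEnrichment and hashBody, and the FNV-1a-style hash over UTF-16 code units are all stand-ins chosen to keep the example dependency-free.

```typescript
// Stand-in content hash (FNV-1a style, over UTF-16 code units).
// A real implementation would more likely use a cryptographic digest.
function hashBody(text: string): string {
  let hash = 0x811c9dc5
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193) >>> 0
  }
  return hash.toString(16)
}

// Hypothetical cache shape: file path -> hash of the body at last enrichment.
type Cache = Record<string, string>

interface Doc {
  path: string
  body: string
}

interface Selection {
  selected: Doc[] // files to send to the provider in this run
  skipped: number // cache hits: body unchanged since the last run
  deferred: number // changed files left for a later run because of the cap
}

function selectForEnrichment(docs: Doc[], cache: Cache, maxEnrich = 20): Selection {
  const changed = docs.filter((d) => cache[d.path] !== hashBody(d.body))
  const selected = changed.slice(0, maxEnrich)
  return {
    selected,
    skipped: docs.length - changed.length,
    deferred: changed.length - selected.length,
  }
}
```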
Privacy gate: only processes files with publish: true. Private vault notes are never sent to an AI provider.
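The gate amounts to a frontmatter check that runs before any provider call. The sketch below is illustrative only: isPublished is a hypothetical name, and the regex-based parsing is an assumption that handles just the simplest frontmatter layout; a real implementation would more likely use a YAML parser.

```typescript
// Returns true only when the leading frontmatter block contains `publish: true`.
// Files without frontmatter are treated as private.
function isPublished(markdown: string): boolean {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown)
  if (!match) return false
  const frontmatter = match[1] ?? ''
  return /^publish:\s*true\s*$/m.test(frontmatter)
}
```

Defaulting to false when the field is missing or unparseable keeps the failure mode on the private side.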
Custom providers
Any object with an enrich(body) method works as a provider:
import { run, defineProvider } from '@karaoke-cms/enrich'
const llamaProvider = defineProvider({
  async enrich(body) {
    const result = await callYourLlama(body)
    return { tags: result.keywords, description: result.summary }
  }
})
await run({ vault: '/path/to/vault', provider: llamaProvider })
Built-in providers (openai, anthropic) are also importable directly:
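// Aliased so both providers can be imported into the same module
import { enrich as enrichWithOpenAI } from '@karaoke-cms/enrich/providers/openai'
import { enrich as enrichWithAnthropic, parseEnrichResponse } from '@karaoke-cms/enrich/providers/anthropic'
Pre-commit hook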
Add to .githooks/pre-commit (or wherever your project configures hooks):
#!/usr/bin/env zsh
staged_md=("${(@f)$(git diff --cached --name-only | grep '\.md$' || true)}")
[[ ${#staged_md[@]} -eq 0 ]] && exit 0
[[ -z "${OPENAI_API_KEY:-}" && -z "${ANTHROPIC_API_KEY:-}" ]] && exit 0
pnpm exec karaoke-enrich --staged "${staged_md[@]}" || true
git add "${staged_md[@]}" .enrich-cache.json
What's new in 0.18.0
- AI-generated titles: untitled documents now receive an AI-generated title in frontmatter, so every published page has a meaningful heading without manual effort.
- Shorter descriptions: the description field now targets 20–30 words instead of longer summaries, producing tighter copy for meta tags and link previews.
What's new in 0.17.0
No user-facing changes in this release.
Return value
run() returns a structured result:
{
  enriched: number   // files where the AI provider was called
  written: number    // files written to disk
  skipped: number    // files skipped (cache hit, no change needed)
  errors: Array<{ path: string, error: Error }>
}
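A caller such as a CI script might turn this result into log lines and an exit code. The sketch below assumes only the result shape documented above; the RunResult and summarize names, the log format, and the "any error means exit code 1" policy are hypothetical choices, not package behaviour.

```typescript
// Mirrors the documented return shape of run().
interface RunResult {
  enriched: number
  written: number
  skipped: number
  errors: Array<{ path: string; error: Error }>
}

// Build printable lines and a suggested exit code from a run result.
function summarize(result: RunResult): { lines: string[]; exitCode: number } {
  const lines = [
    `enriched=${result.enriched} written=${result.written} skipped=${result.skipped}`,
    ...result.errors.map((e) => `error ${e.path}: ${e.error.message}`),
  ]
  // Assumption: any per-file error should fail the calling script.
  return { lines, exitCode: result.errors.length > 0 ? 1 : 0 }
}
```

In a pre-commit hook the opposite policy may be preferable, since the hook shown earlier deliberately ignores enrichment failures.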