# ibid

`@bwthomas/ibid`, v0.6.0
Tools for extracting and normalizing citation metadata in JavaScript. Designed to run in both browser and server contexts.
See SPEC.md for the authoritative behavioral contract.
## Tuning for latency and cost
Several knobs control the latency/cost/quality trade-off. The defaults are tuned for a representative mixed-input corpus; consumers with narrower inputs or tighter latency budgets can adjust:
- **Per-strategy overrides (SPEC §8.1.1.1).** `IbidOptions.strategyOverrides` takes a `{ [strategyName]: { enabled?, fallback?, minCurrentBestConfidence? } }` map. `enabled: false` removes a built-in from the pipeline; `fallback: true` promotes it to the post-primary tier, where `ctx.currentBest` is visible; `minCurrentBestConfidence: N` layers a tighter gate on top of the strategy's own `shouldRun`. Canonical use: `{ CitoidUrl: { fallback: true } }`; in one measurement this cut mean URL-extraction latency roughly 8× at a −1.3pp quality cost.
- **LLM fallback threshold (SPEC §9.3).** The built-in `Llm` strategy fires only when the folded primary-tier confidence is below `options.llmPrompts.fallbackConfidenceThreshold`. The default changed from 75 to 50 on 2026-04-23: the prior default caused the LLM to fire on nearly every URL for marginal gain. Consumers with unusually sparse-metadata corpora can raise it.
- **Bedrock streaming on by default (SPEC §9.2.1).** `createBedrockLlm({ useStreaming })` defaults to `true`, using `InvokeModelWithResponseStream` under the hood. This measured a 51% wall-clock improvement on small outputs (≤200 tokens, where most ibid calls land); the public `LlmResponse` shape is unchanged. Opt out with `useStreaming: false` for ≥1000-token responses, where the buffered path is faster.
- **CrossRefFreetext LLM rescue (SPEC §8.1.2).** Pass an `LlmAdapter` to the `CrossRefFreetext` adapter (subpath `/article-crossref-freetext`) and it will re-rank weak CrossRef top-1 results via the LLM. The rescue fires only when the raw CrossRef score is low or title-token overlap is thin, so cost is bounded to the tail of genuinely ambiguous queries. Failures degrade to the original CrossRef order.
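The LLM fallback threshold is just a confidence comparison, which can be sketched as a tiny predicate. The helper name below is hypothetical and not part of the package API, but the comparison matches the gate described in SPEC §9.3:

```typescript
// Hypothetical helper illustrating the SPEC §9.3 gate: the Llm strategy
// runs only when folded primary-tier confidence falls below
// llmPrompts.fallbackConfidenceThreshold.
function shouldRunLlmFallback(
  foldedConfidence: number,
  fallbackConfidenceThreshold: number = 50, // default since 2026-04-23 (was 75)
): boolean {
  return foldedConfidence < fallbackConfidenceThreshold;
}

// Under the new default, a mid-confidence primary result skips the LLM...
console.log(shouldRunLlmFallback(60)); // false: 60 >= 50, LLM skipped
// ...while under the old default of 75 it would have fired.
console.log(shouldRunLlmFallback(60, 75)); // true
```

This is why the 75 → 50 change matters: results in the 50–74 confidence band, which previously always triggered an LLM call, now resolve from the primary tier alone.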
## Production config recommendation
For a web-reader/toolbar-style consumer with mixed URL/DOI/ISBN/RIS input and a low-latency budget, a measurement on 2026-04-23 (analysis, not in-repo) found the best latency/quality balance from:
- `strategyOverrides: { CitoidUrl: { fallback: true } }`
- `createBedrockLlm(...)` with the default `useStreaming: true`
- `llmPrompts.fallbackConfidenceThreshold: 50` (the new default)
- the `CrossRefFreetext` adapter wired with an LLM for rescue
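Assembled as a plain options object, that recommendation looks roughly like the sketch below. The option names come from the sections above; how the object is passed into ibid, and the Bedrock and CrossRefFreetext wiring, are not shown in this README, so treat this as illustrative rather than verified API:

```typescript
// Sketch of the recommended production options (names from the text above).
// Bedrock adapter and CrossRefFreetext rescue wiring are omitted because
// their construction details are not documented here.
const productionOptions = {
  strategyOverrides: {
    // Demote Citoid to the post-primary tier: ~8x faster mean
    // URL extraction at a -1.3pp quality cost in one measurement.
    CitoidUrl: { fallback: true },
  },
  llmPrompts: {
    // Current default, stated explicitly so a future default change
    // cannot silently alter behavior.
    fallbackConfidenceThreshold: 50,
  },
};

console.log(productionOptions.strategyOverrides.CitoidUrl.fallback); // true
```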
## License
MIT — see LICENSE.
