@ncukondo/search-hub
v0.23.1
A CLI tool for systematic literature searching across multiple academic databases
Features
- Multi-database search: PubMed, ERIC, arXiv, Scopus (Web of Science, Embase planned)
- Unified query syntax: YAML-based DSL with automatic translation and JSON Schema support
- Controlled vocabulary validation: Validates MeSH, ERIC descriptors, and Emtree terms with typo suggestions
- Reproducible searches: Full session logging for PRISMA reporting
- Result filtering: Flexible query expressions (-q) to search and filter results by title, abstract, author, year, and more
- Coverage verification: Check whether known articles appear in search results for query quality validation
- Session comparison: Diff results between query iterations to track refinements
- Resume support: Continue interrupted searches at DB or page level
- Review workflow: Multi-reviewer screening with agreement tracking and finalization
- Fulltext management: OA discovery, automatic retrieval, PMC XML to Markdown conversion
- Reference manager integration: Works with reference-manager
Installation
Binary (no Node.js required)
Download the latest binary for your platform:
Linux / macOS (Intel & Apple Silicon):
curl -fsSL https://raw.githubusercontent.com/ncukondo/search-hub/main/install.sh | bash
Windows (PowerShell):
irm https://raw.githubusercontent.com/ncukondo/search-hub/main/install.ps1 | iex
Or download manually from GitHub Releases.
npm
npm install -g @ncukondo/search-hub
Requires Node.js 22+.
Quick Start
- Initialize a project:
search-hub init
This creates a .search-hub/ directory in the current folder with project config, sessions, and queries.
For global setup (API keys, user preferences):
search-hub init --global
This creates a config file in your platform-specific config directory:
| Platform | Global Config |
|----------|---------------|
| Linux | ~/.config/search-hub/config.toml |
| macOS | ~/Library/Preferences/search-hub/config.toml |
| Windows | %APPDATA%/search-hub/Config/config.toml |
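For scripts that need to locate the global config, the table above can be expressed as a small lookup. This is a minimal sketch and not part of search-hub: the function name `search_hub_global_config` is hypothetical, the platform is taken from `uname -s`, and the Windows branch assumes a Git Bash, MSYS, or Cygwin shell where `APPDATA` is set. Whether search-hub honors overrides such as `XDG_CONFIG_HOME` is not stated here, so the sketch ignores them.

```shell
# Hypothetical helper: print the global config path listed in the table above.
# Usage: search_hub_global_config [platform]   (platform defaults to `uname -s`)
search_hub_global_config() {
  os="${1:-$(uname -s)}"
  case "$os" in
    Linux)
      printf '%s\n' "$HOME/.config/search-hub/config.toml" ;;
    Darwin)
      printf '%s\n' "$HOME/Library/Preferences/search-hub/config.toml" ;;
    MINGW*|MSYS*|CYGWIN*)
      # Assumes APPDATA is exported by the Windows shell environment.
      printf '%s\n' "$APPDATA/search-hub/Config/config.toml" ;;
    *)
      echo "unsupported platform: $os" >&2
      return 1 ;;
  esac
}
```

Calling `search_hub_global_config` with no argument resolves the path for the current machine; passing a platform name such as `Darwin` is mainly useful for checking the mapping.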
- Create a query file:
search-hub query init "my review"
This creates queries/my-review.yaml with JSON Schema support for editor autocompletion. Edit it to define your search:
# yaml-language-server: $schema=./query.schema.json
name: "my review"
description: "Literature search for scoping review"
query:
  - field: title_abstract
    terms:
      keywords:
        - diabetes
        - "machine learning"
    operator: OR
filters:
  year_from: 2020
  language:
    - en
- Validate the query:
search-hub query validate my-review
This checks structure, validates controlled vocabulary terms (MeSH, ERIC descriptors, Emtree) against external APIs, and suggests corrections for typos. The query name is automatically resolved to queries/my-review.yaml.
- Run search:
search-hub search my-review
- Export results:
search-hub export <session-id> --format ids
Query Development
Developing an effective search query is iterative. Start broad, then refine based on results.
Workflow
Create a query - Start with a template:
search-hub query init "my review" # Creates queries/my-review.yaml
Check hit counts - Preview before downloading:
search-hub search my-review --count-only
Run the search - When counts look good:
search-hub search my-review
Review results - Check titles to assess quality:
search-hub results <session-id> --limit 50
search-hub results <session-id> -q "title:diabetes year:2023-2025"
Refine and re-run - Edit the query file, then iterate:
$EDITOR queries/my-review.yaml
search-hub search my-review --count-only # Re-check counts
search-hub search my-review # Execute full search
Compare results with diff - See what changed:
search-hub diff <session-v1> <session-v2> --show removed
This shows articles excluded by your refinements. Review these to ensure you're not losing relevant papers.
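To make the meaning of "removed" concrete, the comparison can be reproduced on two plain lists of article IDs. The sketch below is not how search-hub implements diff; it only illustrates the set difference involved. The function name `removed_ids` and the file names are hypothetical, and it assumes each input file holds one ID per line.

```shell
# Hypothetical helper: print IDs that appear in the OLD list but not in the NEW list.
# Usage: removed_ids OLD_FILE NEW_FILE
#
# Possible way to produce the inputs, ASSUMING the ids export prints one ID
# per line to standard output (not confirmed by this README):
#   search-hub export <session-v1> --format ids > v1.txt
#   search-hub export <session-v2> --format ids > v2.txt
removed_ids() {
  old_sorted=$(mktemp) || return 1
  new_sorted=$(mktemp) || return 1
  # comm requires sorted input; use the C locale so sort and comm agree on order.
  LC_ALL=C sort -u "$1" > "$old_sorted"
  LC_ALL=C sort -u "$2" > "$new_sorted"
  # -2 drops lines unique to NEW, -3 drops lines common to both: only removals remain.
  LC_ALL=C comm -23 "$old_sorted" "$new_sorted"
  rm -f "$old_sorted" "$new_sorted"
}
```

An empty result means the refined query still retrieves every article from the earlier session; any ID that is printed deserves a manual look before the narrower query is accepted.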
Tips for Effective Refinement
- Use --count-only first: Check hit counts before downloading full results.
search-hub search my-review --count-only
- Use --preview to see hit counts with sample titles:
search-hub search my-review --preview
- Use --dry-run to preview translations: See exactly what query each database will receive.
search-hub search my-review --dry-run
- Compare removed articles carefully: When narrowing a search, --show removed reveals what you're excluding. If important papers are removed, your refinement may be too aggressive.
- Track iterations: Use query assess and query log to record and review your refinement history.
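The first and third tips can be combined into a single pre-flight step that runs before every full search. This is a minimal sketch under stated assumptions: the function name `preflight_search` and the `SEARCH_HUB` override variable are hypothetical, and it assumes the CLI exits with a non-zero status when a step fails. Only the `--count-only` and `--dry-run` flags shown above are used.

```shell
# Hypothetical wrapper: check hit counts, then show the per-database translations.
# Stops at the first failing step. SEARCH_HUB may point at another executable
# (for example a stub when testing); it defaults to the search-hub CLI.
# Usage: preflight_search QUERY_NAME
preflight_search() {
  query="$1"
  cli="${SEARCH_HUB:-search-hub}"
  "$cli" search "$query" --count-only || return 1
  "$cli" search "$query" --dry-run || return 1
}
```

If both steps look right, the full run is the plain `search-hub search my-review` call from the workflow.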
Fulltext Retrieval
After screening, retrieve fulltext articles for included papers:
# Check Open Access availability
search-hub fulltext check --session <session-id>
# Download available OA fulltexts (auto-converts PMC XML to Markdown)
search-hub fulltext fetch <session-id>
# For non-OA articles: create directories for manual download
search-hub fulltext init <session-id>
search-hub fulltext pending <session-id>
# After manually adding PDFs, sync and register
search-hub fulltext sync <session-id>
search-hub register <session-id>
See Fulltext Management Guide for details.
Documentation
- Query Guide - How to write query files (DSL, JSON Schema, vocabulary validation)
- Command Reference - All CLI commands and options
- Configuration - Setup and configuration
- Databases - Supported databases, controlled vocabularies, and tips
- Fulltext Management - Fulltext retrieval and management
Development
# Install dependencies
npm install
# Run tests
npm test
# Lint
npm run lint
# Build
npm run build
License
MIT
