paper-search-cli
v0.1.3
Agent-friendly CLI for searching and downloading academic papers from multiple sources.
Paper Search CLI
Paper Search CLI is a standalone Node.js command line tool for searching, validating, and downloading academic papers from multiple scholarly sources. It is designed for direct terminal use, automation scripts, and agent workflows that need a stable command surface with predictable JSON output.
It keeps the broad platform coverage, unified paper model, and detailed capability descriptions of the earlier Paper Search implementation, but runs as a normal CLI process. There is no long-running background service to configure, start, or keep alive.
Thanks to the sincere, friendly, collaborative, and professional LinuxDo community. The CLI + Skill direction and the paper-search workflow refinements in this project were shaped by LinuxDo discussions and open-source sharing.
Quick Start · Configuration · Agent Skill · Supported Platforms · Commands · Tool Reference · Troubleshooting
Design Goals
- Free-first retrieval: prefer public metadata and open-access full-text routes before restricted or fragile sources.
- One command surface: keep search, status, download, and precise tool calls behind the same executable.
- Agent-safe output: produce predictable JSON that can be parsed without scraping terminal text.
- Transparent source behavior: document which platforms provide metadata only, which can download PDFs, and which need API keys.
- No hidden background process: each command starts, returns a result, and exits.
Key Features
- 25 academic sources/platforms: Crossref, OpenAlex, PubMed, PubMed Central, Europe PMC, arXiv, bioRxiv, medRxiv, Semantic Scholar, CORE, OpenAIRE, DBLP, ACM Digital Library metadata, USENIX metadata, OpenReview, Web of Science, Google Scholar, IACR ePrint, Sci-Hub, IEEE Xplore, ScienceDirect, Springer Nature/SpringerLink, Wiley, Scopus, and Unpaywall.
- Single command interface: install once, then call paper-search from terminal, scripts, or agents.
- JSON-first output: stdout is machine-readable JSON by default; stderr is reserved for human-readable diagnostics.
- Unified paper model: normalized title, authors, DOI, source, dates, abstract, PDF URL, citation count, and provider-specific metadata where available.
- Multi-source search with dedupe: query selected sources with --sources crossref,openalex,pmc, or use platform=all to try every registered search source, then merge duplicates by DOI and title/author keys.
- Semantic Scholar body-snippet search: search_semantic_snippets searches Semantic Scholar's Open Access snippet index for body-text snippets, which is useful for finding methodological details. It requires SEMANTIC_SCHOLAR_API_KEY.
- Funnel-style fallback download: download_with_fallback tries native source download, discovered PDF URLs, PMC/Europe PMC/CORE/OpenAIRE, Unpaywall DOI resolution, then Sci-Hub as the final fallback unless useSciHub=false.
- Rate limits and retry logic: platform-specific rate limiting and retryable API error handling.
- PDF download support: download from supported sources such as arXiv, bioRxiv, medRxiv, Semantic Scholar, IACR, Sci-Hub, Springer open access, and Wiley DOI-based access.
- Agent-friendly commands: tools, status, search, download, and run cover both simple use and precise advanced calls.
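The multi-source merge mentioned in the feature list can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the CLI's implementation: the Paper fields are a subset of the unified paper model, and dedupe_key, merge_results, and the exact normalization rules are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Paper:
    """Illustrative subset of the unified paper model (field names are assumed)."""
    title: str
    authors: List[str] = field(default_factory=list)
    doi: Optional[str] = None
    source: str = ""


def dedupe_key(paper: Paper) -> str:
    """Prefer the DOI; otherwise fall back to a normalized title/first-author key."""
    if paper.doi:
        return "doi:" + paper.doi.strip().lower()
    title = re.sub(r"[^a-z0-9]+", " ", paper.title.lower()).strip()
    first_author = paper.authors[0].strip().lower() if paper.authors else ""
    return f"title:{title}|author:{first_author}"


def merge_results(papers: List[Paper]) -> List[Paper]:
    """Keep the first record seen for each key, preserving input order."""
    seen: Dict[str, Paper] = {}
    for paper in papers:
        seen.setdefault(dedupe_key(paper), paper)
    return list(seen.values())
```

Keeping the first record per key means the order of the source list decides which provider's metadata wins; a real merge might instead combine fields from both records.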
Quick Start
Install
Requires Node.js >= 18.0.0 and npm.
npm install -g paper-search-cli
paper-search setup
paper-search search "machine learning" --platform crossref --max-results 3 --pretty
Run paper-search setup after installation to write optional API keys and emails into the user config.
For the Unpaywall and Crossref email prompts, you can press Enter and the CLI will write a random Gmail-format address automatically; use paper-search config set later if you want to replace it with your own email.
For local development, or to test changes that have not been released yet, install from source:
git clone [email protected]:dr-dumpling/paper-search-cli.git
cd paper-search-cli
npm install
npm run build
npm install -g .
Common Checks
paper-search status --pretty
paper-search tools --pretty
paper-search config doctor --pretty
Supported Platforms
Platform Families
The Capability Matrix below remains the source of truth for capabilities. To choose a source quickly, use these broad families:
| Family | Platforms | Best For |
| --- | --- | --- |
| General scholarly metadata | Crossref, OpenAlex, Semantic Scholar, Google Scholar | Broad discovery, DOI metadata, citation clues, first-pass literature search |
| Medicine / life sciences | PubMed, PubMed Central, Europe PMC | Clinical, biomedical, and public-health literature; metadata and open full text |
| Preprints / conference papers | arXiv, bioRxiv, medRxiv, OpenReview, IACR ePrint | Cross-disciplinary preprints, life-science/medical preprints, AI/ML submissions, and cryptography ePrints |
| Computer science / engineering | DBLP, ACM Digital Library metadata, IEEE Xplore, USENIX | CS bibliography, engineering databases, systems/security proceedings |
| Open full text / repositories | CORE, OpenAIRE, Unpaywall | Cross-disciplinary repository discovery and open-access PDF fallback routes |
| Citation indexes / publishers | Web of Science, Scopus, ScienceDirect, Springer Nature/SpringerLink, Wiley | Institution-backed metadata, citation databases, publisher-specific records and downloads |
| DOI-targeted lookup | Sci-Hub | DOI-based retrieval and the final automatic PDF fallback unless useSciHub=false |
Some platforms belong to more than one practical workflow. For example, Semantic Scholar is useful for broad discovery and CS/AI, while arXiv covers CS, math, physics, and quantitative fields. These groups reflect the primary way a platform is used; CS searches often combine "computer science / engineering" with "preprints / conference papers."
Capability Matrix
General Scholarly Metadata
| Platform | Search | Download | Full Text | Citations | API Key | Special Features |
| --- | --- | --- | --- | --- | --- | --- |
| Crossref | ✅ | ❌ | ❌ | ✅ | ❌ | Default search platform, broad metadata coverage |
| OpenAlex | ✅ | 🟡 Conditional | ❌ | ✅ | ❌ | Broad free metadata; can feed fallback downloads when records include OA links |
| Semantic Scholar | ✅ | ✅ | ✅ Body snippets | ✅ | 🟡 Optional* | AI semantic search + OA body snippets |
| Google Scholar | ✅ | ❌ | ❌ | ✅ | ❌ | Broad academic discovery, scrape-based |
Medicine / Life Sciences
| Platform | Search | Download | Full Text | Citations | API Key | Special Features |
| --- | --- | --- | --- | --- | --- | --- |
| PubMed | ✅ | ❌ | ❌ | ❌ | 🟡 Optional | Biomedical literature through NCBI E-utilities |
| PubMed Central | ✅ | ✅ | ✅ | ❌ | ❌ | Open biomedical full text and PMC PDFs |
| Europe PMC | ✅ | ✅ | ✅ | ❌ | ❌ | Biomedical metadata plus open full-text links |
Computer Science / Engineering
| Platform | Search | Download | Full Text | Citations | API Key | Special Features |
| --- | --- | --- | --- | --- | --- | --- |
| DBLP | ✅ | ❌ | ❌ | ❌ | ❌ | Computer science bibliography through the official DBLP search API |
| ACM Digital Library | ✅ | ❌ | ❌ | ✅ | ❌ | ACM DOI-prefix metadata through Crossref; no ACM scraping |
| USENIX | ✅ | ❌ | ❌ | ❌ | ❌ | DBLP-backed USENIX proceedings metadata; no USENIX search-page scraping |
| IEEE Xplore | ✅ | ❌ | ❌ | ✅ | ✅ Required | IEEE metadata through the official IEEE Xplore Metadata API |
Preprints / Conference Papers
| Platform | Search | Download | Full Text | Citations | API Key | Special Features |
| --- | --- | --- | --- | --- | --- | --- |
| arXiv | ✅ | ✅ | ✅ | ❌ | ❌ | Physics, CS, math, and related preprints |
| bioRxiv | ✅ | ✅ | ✅ | ❌ | ❌ | Biology preprints |
| medRxiv | ✅ | ✅ | ✅ | ❌ | ❌ | Medical preprints |
| OpenReview | ✅ | ❌ | ❌ | ❌ | ❌ | Conference submissions, reviews, and preprints through public OpenReview notes search |
| IACR ePrint | ✅ | ✅ | ✅ | ❌ | ❌ | Cryptography papers |
Open Full Text / Repositories
| Platform | Search | Download | Full Text | Citations | API Key | Special Features |
| --- | --- | --- | --- | --- | --- | --- |
| CORE | ✅ | 🟡 Conditional | 🟡 Conditional | ❌ | 🟡 Optional | Downloads work when records include PDF or full-text links |
| OpenAIRE | ✅ | 🟡 Conditional | ❌ | ❌ | 🟡 Optional | Can feed fallback downloads when records include open links |
| Unpaywall | 🟡 Conditional | 🟡 Conditional | ❌ | ❌ | ✅ Required | DOI-only lookup; requires an email; downloads work when an OA PDF is found |
Citation Indexes / Publishers
| Platform | Search | Download | Full Text | Citations | API Key | Special Features |
| --- | --- | --- | --- | --- | --- | --- |
| Web of Science | ✅ | ❌ | ❌ | ✅ | ✅ Required | Citation database, date sorting, year ranges |
| ScienceDirect | ✅ | ❌ | ❌ | ✅ | ✅ Required | Elsevier metadata and abstracts |
| Springer Nature / SpringerLink | ✅ | 🟡 Conditional | ❌ | ❌ | ✅ Required | springerlink is an alias for the existing Springer Nature integration |
| Wiley | ❌ Keyword search | ✅ | ✅ | ❌ | ✅ Required | TDM API, DOI-based PDF download only |
| Scopus | ✅ | ❌ | ❌ | ✅ | ✅ Required | Abstract and citation database |
DOI-Targeted Lookup
| Platform | Search | Download | Full Text | Citations | API Key | Special Features |
| --- | --- | --- | --- | --- | --- | --- |
| Sci-Hub | ✅ | ✅ | ❌ | ❌ | ❌ | DOI-based paper lookup and PDF retrieval |
Notes:
- In capability columns, ✅ means directly supported, ❌ means unsupported, and 🟡 Conditional means support depends on record content or provider constraints, such as DOI-only lookup, available PDF/OA links, or open-access-only downloads.
- In the API Key column, ❌ means no configuration is needed, 🟡 Optional means configuration improves limits or stability, and ✅ Required means the key is required only when you use that platform, not that every new installation should configure it. Unpaywall requires an email rather than a traditional API key.
- Wiley does not support keyword search through the Wiley TDM API. Use search_crossref to find Wiley articles and then use download_paper with platform=wiley and the DOI.
- ACM and USENIX search intentionally use metadata-backed routes rather than crawling provider search pages, which keeps the integration compatible with robots.txt and reduces IP-blocking risk.
- platform=all tries every registered search source except DOI-download-only providers such as Wiley. Sources without configured credentials, sources that time out, and sources that fail are recorded in failed_sources/errors while the remaining sources continue.
- --sources accepts a comma-separated source list, for example --sources crossref,openalex,pmc.
- 🟡 Optional* for Semantic Scholar means optional for regular search; search_semantic_snippets body-snippet search requires SEMANTIC_SCHOLAR_API_KEY.
Configuration
Most free metadata sources work without configuration. For API keys and emails, prefer the user-level config file so the CLI works from any directory:
paper-search setup
paper-search config set SEMANTIC_SCHOLAR_API_KEY your_semantic_scholar_api_key_here
paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL [email protected] # optional: replace the setup-generated email
paper-search config list --pretty
paper-search config doctor --pretty
paper-search diagnostics --pretty
The default config path is:
~/.config/paper-search-cli/config.json
The file is written with 0600 permissions. config list and config doctor mask secrets.
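If a script needs to confirm that the config file is still private, a check along these lines may help. It is a minimal sketch for POSIX systems; is_private_file is a hypothetical helper and not part of the CLI.

```python
import os
import stat


def is_private_file(path: str) -> bool:
    """Return True when the file mode is exactly 0600 (owner read/write only)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == 0o600
```

For example, is_private_file(os.path.expanduser("~/.config/paper-search-cli/config.json")) should return True for a file created with the documented permissions.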
paper-search setup is the guided setup command. By default it asks for the recommended credentials only: Semantic Scholar, Unpaywall email, Crossref email, and CORE. Use paper-search setup --all to walk through every supported configuration key, or paper-search setup --keys SEMANTIC_SCHOLAR_API_KEY,CORE_API_KEY to configure a specific subset.
To reduce first-run friction, if PAPER_SEARCH_UNPAYWALL_EMAIL / UNPAYWALL_EMAIL / CROSSREF_MAILTO are not configured, pressing Enter during setup writes a random Gmail-format address such as [email protected], so basic Unpaywall and Crossref requests can run immediately.
paper-search diagnostics --pretty lists every API-key or email-backed capability, the related config keys, whether the required keys are configured, common failure modes, and suggested next checks. Search commands also add a diagnostic field when a key-backed platform returns zero results or an auth/permission/rate-limit error.
API Key Recommendation
paper-search setup asks only for the credentials that are most useful for ordinary new users. ✅ Required in the platform table means "required for that platform", not "recommended for every installation".
| Level | Config keys | Recommended for new users | Notes |
| --- | --- | --- | --- |
| Default recommended | SEMANTIC_SCHOLAR_API_KEY | Yes | Enables Semantic Scholar body-snippet search for methodology details and improves request stability. |
| Default recommended | PAPER_SEARCH_UNPAYWALL_EMAIL or UNPAYWALL_EMAIL | Yes | Finds open-access PDFs from DOI records; this only needs an email, not an API key. Press Enter in setup to generate a random Gmail-format email, or replace it manually. |
| Default recommended | CROSSREF_MAILTO | Yes | Puts Crossref requests in the polite pool, which is better for long-running or frequent searches. Press Enter in setup to reuse the generated email, or replace it manually. |
| Default recommended | CORE_API_KEY or PAPER_SEARCH_CORE_API_KEY | Yes | CORE anonymous access is often rate-limited; a key makes open repository search more reliable. |
| Biomedical-heavy use | PUBMED_API_KEY, NCBI_EMAIL, NCBI_TOOL | Recommended if you use PubMed heavily | Raises NCBI E-utilities limits and identifies the client. |
| Institution entitlement | WOS_API_KEY | Configure only with Web of Science API access | Enables Web of Science search and citation data; requires Clarivate API entitlement. |
| Institution entitlement | IEEE_API_KEY | Configure only with IEEE Xplore API access | Enables IEEE Xplore metadata search; IEEE may require registered API access and product entitlement. |
| Institution entitlement | ELSEVIER_API_KEY | Configure only with Scopus or ScienceDirect API access | One Elsevier key does not automatically grant both products; Scopus and ScienceDirect need separate entitlements. |
| Institution entitlement | SPRINGER_API_KEY, SPRINGER_OPENACCESS_API_KEY | Configure only when you need Springer | Used for Springer metadata and open-access records; 401 usually means an invalid key or missing product access. |
| Institution entitlement | WILEY_TDM_TOKEN | Configure only with Wiley TDM/institutional full-text access | DOI-based download only; availability depends on the token and institutional subscription. |
| Usually unnecessary | PAPER_SEARCH_OPENAIRE_API_KEY or OPENAIRE_API_KEY | Not recommended by default | OpenAIRE public search usually works without a key; configure only for account or quota requirements. |
You can also import an existing .env:
paper-search config import-env .env --pretty
Config priority is:
- Shell environment variables.
- Current working directory .env.
- User config file.
- Built-in defaults for free sources.
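The priority order above amounts to a first-match lookup across four layers. The sketch below illustrates the idea only; resolve_setting and its parameters are hypothetical, and treating empty values as unset is an assumption rather than documented CLI behavior.

```python
from typing import Mapping, Optional


def resolve_setting(
    key: str,
    shell_env: Mapping[str, str],
    dotenv: Mapping[str, str],
    user_config: Mapping[str, str],
    defaults: Mapping[str, str],
) -> Optional[str]:
    """Return the first non-empty value, checking layers from highest to lowest priority."""
    for layer in (shell_env, dotenv, user_config, defaults):
        value = layer.get(key)
        if value:  # assumption: an empty string counts as "not configured"
            return value
    return None
```

Under this reading, a key exported in the shell always overrides the same key in .env or the user config file.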
For repo-local development, copying .env.example still works:
cp .env.example .env
Environment Variables
# Web of Science, required for Web of Science search
WOS_API_KEY=your_web_of_science_api_key_here
WOS_API_VERSION=v1
# IEEE Xplore, required for IEEE metadata search
IEEE_API_KEY=your_ieee_api_key_here
# PubMed, optional; increases rate limit from 3 requests/sec to 10 requests/sec
PUBMED_API_KEY=your_ncbi_api_key_here
NCBI_EMAIL=[email protected]
NCBI_TOOL=paper-search-cli
# Semantic Scholar, required for body-snippet search and useful for higher request limits
SEMANTIC_SCHOLAR_API_KEY=your_semantic_scholar_api_key_here
# Elsevier, required for Scopus and ScienceDirect; each product still needs separate entitlement
ELSEVIER_API_KEY=your_elsevier_api_key_here
# Springer Nature, required for Springer search and open access download
SPRINGER_API_KEY=your_springer_api_key_here
SPRINGER_OPENACCESS_API_KEY=your_openaccess_api_key_here
# Wiley TDM, required for Wiley DOI-based PDF download
WILEY_TDM_TOKEN=your_wiley_tdm_token_here
# Crossref polite pool, optional but recommended; setup can auto-generate/reuse a random Gmail-format email
CROSSREF_MAILTO=[email protected]
# Unpaywall, required for DOI-based OA resolution; setup can auto-generate a random Gmail-format email
PAPER_SEARCH_UNPAYWALL_EMAIL=[email protected]
UNPAYWALL_EMAIL=[email protected]
# CORE, optional but recommended; anonymous access is often heavily rate-limited
PAPER_SEARCH_CORE_API_KEY=your_core_api_key_here
CORE_API_KEY=your_core_api_key_here
# OpenAIRE, optional; public search works without a key
PAPER_SEARCH_OPENAIRE_API_KEY=your_openaire_api_key_here
OPENAIRE_API_KEY=your_openaire_api_key_here
API Key Sources
- Web of Science: Clarivate Developer Portal
- IEEE Xplore: IEEE Xplore Metadata API
- PubMed: NCBI API Keys
- Semantic Scholar: Semantic Scholar API
- Elsevier: Elsevier Developer Portal
- Springer Nature: Springer Nature Developers
- Wiley TDM: Wiley Text and Data Mining
- Unpaywall: Unpaywall Data Format and API
- CORE: CORE API
- OpenAIRE: OpenAIRE APIs
.env is ignored by git. Do not commit API keys or tokens.
Agent Skill
This repository includes an optional agent skill at skills/paper-search/SKILL.md. Install it into your agent's skill directory if your agent supports skills.
For example:
mkdir -p ~/.agents/skills/paper-search
cp skills/paper-search/SKILL.md ~/.agents/skills/paper-search/SKILL.md
The skill only teaches the agent how to call the paper-search CLI. API keys are still configured through paper-search setup, paper-search config, .env, or shell environment variables. Do not store secrets in the skill file.
Output Contract
By default, every command writes JSON to stdout.
{
"ok": true,
"tool": "search_papers",
"message": "Found 1 papers.",
"data": []
}
Use --pretty for formatted JSON:
paper-search search "machine learning" --platform crossref --max-results 1 --pretty
Use --format text if you need the raw text response:
paper-search search "machine learning" --platform crossref --max-results 1 --format text
Use --include-text to keep the raw response text alongside parsed JSON:
paper-search run search_crossref --arg query="machine learning" --arg maxResults=3 --include-text --pretty
Commands
paper-search search
Unified search entrypoint.
paper-search search <query> [options]
Examples:
paper-search search "machine learning" --platform crossref --max-results 10 --pretty
paper-search search "machine learning" --sources crossref,openalex --max-results 2 --pretty
paper-search search "cancer immunotherapy" --platform all --max-results 2 --pretty
paper-search search "transformer neural networks" --platform arxiv --category cs.AI --year 2023 --pretty
paper-search search "COVID-19 vaccine efficacy" --platform pubmed --max-results 20 --year 2023 --pretty
paper-search search "CRISPR gene editing" --platform webofscience --journal Nature --max-results 15 --pretty
Common options:
| Option | Description |
| --- | --- |
| --platform | Source platform. Default: crossref |
| --sources | Comma-separated source list for multi-source search, e.g. crossref,openalex,pmc |
| --max-results | Maximum result count |
| --year | Year filter, e.g. 2024, 2020-2024, 2020- |
| --author | Author name filter |
| --journal | Journal name filter |
| --category | Category filter, mainly arXiv/bioRxiv/medRxiv |
| --days | Days back for bioRxiv/medRxiv |
| --sort-by | relevance, date, or citations |
| --sort-order | asc or desc |
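The three --year forms in the table (single year, closed range, open-ended range) can be read as an inclusive start/end pair. The sketch below shows one way a wrapper script might interpret them before building a command; parse_year_filter is hypothetical, and how the CLI parses the value internally is not documented here.

```python
from typing import Optional, Tuple


def parse_year_filter(value: str) -> Tuple[Optional[int], Optional[int]]:
    """Parse '2024', '2020-2024', or '2020-' into an inclusive (start, end) pair."""
    text = value.strip()
    if "-" not in text:
        year = int(text)
        return year, year
    start, _, end = text.partition("-")
    # A missing side of the range is returned as None (open-ended).
    return (int(start) if start else None, int(end) if end else None)
```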
paper-search run
Run a specific internal tool by name. This is the most precise command for agent workflows.
paper-search run <tool-name> --arg key=value --arg key=value
paper-search run <tool-name> --json-args '{"key":"value"}'
paper-search run <tool-name> --json-args @args.json
Examples:
paper-search run search_crossref --arg query="machine learning" --arg maxResults=5 --pretty
paper-search run search_papers --json-args '{"query":"machine learning","sources":"crossref,openalex","maxResults":2}' --pretty
paper-search run search_pubmed --json-args '{"query":"osteoarthritis","maxResults":5,"sortBy":"date"}' --pretty
paper-search run get_paper_by_doi --arg doi="10.1038/nature12373" --pretty
paper-search tools
List all available tool names, descriptions, and input schemas.
paper-search tools --pretty
paper-search status
Show platform capabilities and API key status. Secrets are never printed.
paper-search status --pretty
paper-search status --validate --pretty
--validate may make live provider requests. Use it when you intentionally want credential validation.
paper-search diagnostics
Show API-key-backed capabilities and troubleshooting guidance. This does not print secrets.
paper-search diagnostics --pretty
When a command returns zero results from a configured key-backed source, or fails with 401, 403, 400, or 429, JSON output includes a diagnostic field with likely causes and next actions.
paper-search config
Manage the user-level config file.
paper-search config init --pretty
paper-search config set SEMANTIC_SCHOLAR_API_KEY your_key --pretty
paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL [email protected] --pretty # optional: replace the setup-generated email
paper-search config import-env .env --pretty
paper-search config list --pretty
paper-search config doctor --pretty
paper-search config path --pretty
paper-search config keys --pretty
paper-search download
Download a paper PDF through a platform that supports downloads.
paper-search download <paper-id-or-doi> --platform <platform> [--save-path ./downloads]
Examples:
paper-search download 2301.00001 --platform arxiv --save-path ./downloads
paper-search download 10.1000/example --platform scihub --save-path ./downloads
paper-search download 10.1111/jtsb.12390 --platform wiley --save-path ./downloads
paper-search run download_with_fallback --arg source=arxiv --arg paperId=1201.0490 --arg doi=10.48550/arxiv.1201.0490 --arg savePath=./downloads --pretty
Tool Reference
These names can be used with paper-search run.
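When calling these tools from a script, passing arguments as a single --json-args value avoids shell-quoting problems with nested JSON. The sketch below only builds the argument list; build_run_command is a hypothetical helper, while the run subcommand and --json-args flag are the ones documented above.

```python
import json
from typing import Any, Dict, List


def build_run_command(tool_name: str, args: Dict[str, Any]) -> List[str]:
    """Build an argv list equivalent to: paper-search run <tool-name> --json-args '<json>'."""
    return ["paper-search", "run", tool_name, "--json-args", json.dumps(args)]
```

The resulting list can be handed to subprocess.run(cmd, capture_output=True, text=True) without going through a shell, so no extra quoting is needed.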
search_papers
Search across the unified dispatcher.
paper-search run search_papers --json-args '{"query":"machine learning","platform":"crossref","maxResults":10,"year":"2023","sortBy":"date"}' --pretty
Supported platforms:
crossref, arxiv, webofscience, wos, pubmed, biorxiv, medrxiv, semantic,
iacr, googlescholar, scholar, scihub, ieee, sciencedirect, springer,
springerlink, scopus, openalex, unpaywall, pmc, europepmc, core,
openaire, dblp, acm, usenix, openreview, all
For multi-source search, pass sources:
paper-search run search_papers --json-args '{"query":"machine learning","sources":"crossref,openalex,pmc","maxResults":2}' --pretty
search_crossref
Search Crossref, the default free metadata source.
paper-search run search_crossref --arg query="machine learning" --arg maxResults=10 --arg year=2023 --arg sortBy=relevance --arg sortOrder=desc --pretty
search_arxiv
Search arXiv preprints.
paper-search run search_arxiv --arg query="transformer neural networks" --arg maxResults=10 --arg category=cs.AI --arg year=2023 --arg sortBy=date --arg sortOrder=desc --pretty
search_pubmed
Search PubMed/MEDLINE biomedical literature.
paper-search run search_pubmed --json-args '{"query":"COVID-19 vaccine efficacy","maxResults":20,"year":"2023","journal":"New England Journal of Medicine","publicationType":["Journal Article","Clinical Trial"],"sortBy":"date"}' --pretty
Open Metadata And Full-Text Sources
Use these commands for open metadata search, open full-text discovery, and fallback PDF lookup:
paper-search run search_openalex --arg query="machine learning" --arg maxResults=3 --pretty
paper-search run search_unpaywall --arg query="10.48550/arxiv.1201.0490" --pretty
paper-search run search_pmc --arg query="cancer immunotherapy" --arg maxResults=3 --pretty
paper-search run search_europepmc --arg query="cancer genomics" --arg maxResults=3 --pretty
paper-search run search_core --arg query="machine learning" --arg maxResults=3 --pretty
paper-search run search_openaire --arg query="machine learning" --arg maxResults=3 --pretty
Unpaywall is DOI-only and requires an email. CORE public access may return zero results or rate-limit quickly without an API key.
Registry-Backed Platform Search
These metadata-oriented tools are generated from the platform registry, so adding later platforms only needs a new searcher plus registry metadata:
paper-search run search_dblp --arg query="graph neural networks" --arg maxResults=5 --pretty
paper-search run search_acm --arg query="software testing" --arg maxResults=5 --pretty
paper-search run search_usenix --arg query="file systems" --arg maxResults=5 --pretty
paper-search run search_openreview --arg query="large language models" --arg maxResults=5 --pretty
paper-search run search_springerlink --arg query="machine learning" --arg maxResults=5 --pretty
search_ieee uses the same generic schema but requires IEEE_API_KEY:
paper-search run search_ieee --arg query="wireless networks" --arg maxResults=5 --arg articleTitle="wireless" --pretty
search_webofscience
Search Web of Science. Requires WOS_API_KEY.
paper-search run search_webofscience --arg query="CRISPR gene editing" --arg maxResults=15 --arg year=2022 --arg journal=Nature --pretty
search_google_scholar
Search Google Scholar.
paper-search run search_google_scholar --arg query="deep learning" --arg maxResults=10 --arg yearLow=2020 --arg yearHigh=2024 --pretty
search_biorxiv and search_medrxiv
Search preprint servers by recent day window and optional category.
paper-search run search_biorxiv --arg query="genomics" --arg maxResults=10 --arg days=30 --pretty
paper-search run search_medrxiv --arg query="epidemiology" --arg maxResults=10 --arg days=60 --pretty
search_semantic_scholar
Search Semantic Scholar with optional field filters.
paper-search run search_semantic_scholar --json-args '{"query":"graph neural networks","maxResults":10,"fieldsOfStudy":["Computer Science"]}' --pretty
search_semantic_snippets
Search Semantic Scholar's Open Access snippet index for body-text snippets that can help locate methodological details. Requires SEMANTIC_SCHOLAR_API_KEY.
paper-search run search_semantic_snippets --arg query="CMAverse mediation bootstrap confidence interval" --arg limit=5 --arg fieldsOfStudy=Medicine --pretty
search_iacr
Search IACR ePrint Archive.
paper-search run search_iacr --arg query="zero knowledge proof" --arg maxResults=10 --arg fetchDetails=true --pretty
search_sciencedirect
Search ScienceDirect. Requires ELSEVIER_API_KEY.
paper-search run search_sciencedirect --arg query="materials science" --arg maxResults=10 --arg openAccess=true --pretty
search_scopus
Search Scopus. Requires ELSEVIER_API_KEY.
paper-search run search_scopus --arg query="citation analysis" --arg maxResults=10 --arg documentType=ar --pretty
search_springer
Search Springer Nature. Requires SPRINGER_API_KEY.
paper-search run search_springer --arg query="machine learning" --arg maxResults=10 --arg type=Journal --arg openAccess=true --pretty
search_scihub
Look up a DOI or article URL through Sci-Hub and optionally download a PDF.
paper-search run search_scihub --arg doiOrUrl="10.1038/nature12373" --arg downloadPdf=false --pretty
paper-search run search_scihub --arg doiOrUrl="10.1038/nature12373" --arg downloadPdf=true --arg savePath=./downloads --pretty
check_scihub_mirrors
Show Sci-Hub mirror health.
paper-search run check_scihub_mirrors --pretty
paper-search run check_scihub_mirrors --arg forceCheck=true --pretty
get_paper_by_doi
Look up metadata by DOI.
paper-search run get_paper_by_doi --arg doi="10.1038/nature12373" --arg platform=all --pretty
paper-search run get_paper_by_doi --arg doi="10.1038/nature12373" --arg platform=arxiv --pretty
download_paper
Download PDF files from a platform. If the selected platform has no native downloader, or if native download fails, the command enters the same fallback funnel used by download_with_fallback.
paper-search run download_paper --arg paperId="2301.00001" --arg platform=arxiv --arg savePath=./downloads --pretty
Native download platforms:
arxiv, biorxiv, medrxiv, semantic, iacr, scihub, springer, wiley,
pmc, europepmc, core
Other registered sources, such as crossref, openalex, dblp, acm, usenix, or openreview, can still be passed to download_paper; they start directly at the metadata/repository/Unpaywall/Sci-Hub fallback funnel.
download_with_fallback
Try the full download funnel. The order is source-native download, metadata PDF URL, repository discovery, Unpaywall DOI resolution, then Sci-Hub as the final fallback:
paper-search run download_with_fallback --arg source=arxiv --arg paperId=1201.0490 --arg doi=10.48550/arxiv.1201.0490 --arg savePath=./downloads --pretty
paper-search run download_with_fallback --arg source=crossref --arg paperId="10.1038/nature12373" --arg doi="10.1038/nature12373" --arg savePath=./downloads --pretty
useSciHub defaults to true; set it to false only when you need to suppress that final fallback. download_paper also routes failed or unsupported platform downloads through the same funnel.
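The funnel order described in this section can be sketched as an ordered list of attempts that stops at the first success. This is an illustration, not the CLI's code: run_funnel, the step names, and the convention that a step returns a saved path or None are all assumptions.

```python
from typing import Callable, List, Optional, Tuple

# Each step is (name, attempt); attempt returns a saved file path or None.
Step = Tuple[str, Callable[[], Optional[str]]]


def run_funnel(steps: List[Step], use_scihub: bool = True) -> Tuple[Optional[str], List[str]]:
    """Try steps in order; return the first saved path plus the names attempted."""
    attempted: List[str] = []
    for name, attempt in steps:
        if name == "scihub" and not use_scihub:
            continue  # mirrors useSciHub=false: skip the final fallback
        attempted.append(name)
        try:
            path = attempt()
        except Exception:
            path = None  # a failing step hands over to the next one
        if path:
            return path, attempted
    return None, attempted
```

Returning the list of attempted steps alongside the path makes it easy to report which route finally produced the PDF.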
search_wiley
Wiley keyword search is not supported by the Wiley TDM API. Use Crossref first, then download by DOI:
paper-search run search_crossref --arg query="site:wiley.com machine learning" --arg maxResults=10 --pretty
paper-search run download_paper --arg paperId="10.1111/example" --arg platform=wiley --pretty
get_platform_status
Same as paper-search status.
paper-search run get_platform_status --pretty
paper-search run get_platform_status --arg validate=true --pretty
Troubleshooting
Command Not Found
Run from the project:
node dist/cli.js status --pretty
Or register the local command:
npm link
paper-search status --pretty
Missing API Key
Run:
paper-search status --pretty
If a provider shows missing, add the relevant key through paper-search setup, user config, or .env, then rerun the command.
For global installs, prefer user config:
paper-search setup
paper-search config set SEMANTIC_SCHOLAR_API_KEY your_key
paper-search config doctor --pretty
Provider Rate Limits
Reduce --max-results, avoid repeated live validation, and prefer sources with official APIs. PubMed, Semantic Scholar, and CORE support optional keys for better limits. CORE anonymous access can return HTTP 429; configure PAPER_SEARCH_CORE_API_KEY when you rely on it.
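Scripts that call the CLI in a loop may also want their own backoff when a provider keeps answering HTTP 429. The sketch below is a generic exponential-backoff wrapper, separate from the CLI's built-in retry logic; RateLimited and with_backoff are hypothetical names, and the caller is assumed to raise RateLimited after inspecting a failed result.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error a caller raises when a provider answers HTTP 429."""


def with_backoff(
    call: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the delay after each failed attempt."""
    for attempt in range(retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == retries:
                raise  # retries exhausted: surface the error to the caller
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

The sleep parameter exists so the wrapper can be exercised without real delays.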
JSON Parsing In Scripts
Use default JSON output and parse stdout. Human diagnostics are written to stderr.
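A script can treat stdout as a single JSON document with the fields shown in the Output Contract. The following is a minimal sketch: parse_result is a hypothetical helper, and it assumes that a failed command reports ok as false with an explanatory message, which the example output above does not show explicitly.

```python
import json
from typing import Any, Dict


def parse_result(stdout: str) -> Dict[str, Any]:
    """Parse the CLI's JSON envelope and raise if the command reported failure."""
    result = json.loads(stdout)
    if not result.get("ok"):
        # Assumption: failures carry ok=false and a human-readable message.
        raise RuntimeError(result.get("message", "paper-search command failed"))
    return result
```

In practice the input would come from something like subprocess.run(["paper-search", "search", "machine learning", "--platform", "crossref"], capture_output=True, text=True).stdout, with stderr kept aside for diagnostics.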
Usage Boundaries
Some sources may be subject to platform terms, institutional subscriptions, or local law. Use restricted integrations only when you have the appropriate access rights and permission.
Project Origin
This project acknowledges and thanks the LinuxDo community.
The CLI + Skill direction and paper-search workflow refinements were shaped by community discussions and open-source sharing. This repository keeps the workflow focused on a one-command terminal tool and does not require an MCP runtime.
It also references ideas from openags/paper-search-mcp while adapting the workflow to a standalone CLI.
License
MIT
