@lsbjordao/genbank-js
v1.0.2
NCBI GenBank / E-utilities client for Node.js
Node.js client for the NCBI GenBank E-utilities and database-backed BLAST workflows. Handles throttling, retries, exponential backoff, and pagination automatically.
Installation
npm install @lsbjordao/genbank-js
Requires Node.js >= 18.
Quick start
import { GenbankService } from '@lsbjordao/genbank-js';
const genbank = new GenbankService({
tool: 'my-app',
email: 'you@example.com', // placeholder address, replace with your own
apiKey: process.env.NCBI_API_KEY, // optional — raises limit to 10 req/s
});
CommonJS
const { GenbankService } = require('@lsbjordao/genbank-js');
Constructor options
| Option | Type | Default | Description |
|---|---|---|---|
| tool | string | required | Name of your application (sent to NCBI) |
| email | string | required | Your email (sent to NCBI) |
| apiKey | string | — | NCBI API key — raises rate limit from 3 to 10 req/s |
| baseUrl | string | NCBI E-utils URL | Override the base URL |
| timeoutMs | number | 30000 | Request timeout in milliseconds |
| maxRetries | number | 4 | Max retry attempts on 429/5xx |
| requestsPerSecond | number | 3 or 10 with API key | Rate limit |
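The maxRetries option and the "429/5xx" wording above can be pictured with a small sketch of an exponential backoff schedule. This is not the package's implementation: the helper names, the 500 ms base delay, and the 8 s cap are assumptions chosen for illustration.

```typescript
// Hypothetical helpers: names and default values are illustrative,
// not taken from the package's source.

/** True for the status codes the options table lists as retried: 429 and 5xx. */
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

/**
 * Delay before retry number `attempt` (0-based). The delay doubles on each
 * attempt and is capped at `capMs`. Base and cap are assumed values.
 */
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

/** Delays for a full retry budget, for example maxRetries = 4. */
function retrySchedule(maxRetries: number): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => backoffDelayMs(attempt));
}
```

Under these assumed values, the default of four retries would wait 500, 1000, 2000, and 4000 ms between attempts. Many clients also add random jitter to each delay; that is omitted here to keep the sketch deterministic.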
Sub-services
Each service is accessible as a property of GenbankService:
| Property | Service | Purpose |
|---|---|---|
| genbank.search | EsearchService | Search IDs via Entrez |
| genbank.fetch | EfetchService | Retrieve sequences and records |
| genbank.summary | EsummaryService | Fetch document summaries |
| genbank.link | ElinkService | Cross-database links |
| genbank.info | EinfoService | Database metadata and field info |
| genbank.blast | BlastService | BLAST searches |
Most methods are also available as top-level delegates on GenbankService for convenience.
ESearch — search for IDs
// Raw esearch — returns count, idlist, WebEnv, query_key
const result = await genbank.esearch({
db: 'nuccore',
term: 'Mimosa[Organism] AND ITS',
retmax: 50,
});
console.log(result.idlist); // ['MZ934621.1', ...]
// Shorthand — returns only the accession list
const accessions = await genbank.searchAccessions('Mimosa[Organism] AND ITS', 100);
// Store on NCBI history server — use with large result sets
const { count, webenv, queryKey } = await genbank.searchWithHistory(
'Mimosa[Organism] AND ITS',
);
console.log(`${count} records found`);
ESearch params
| Param | Type | Description |
|---|---|---|
| db | EutilDb | Database: nuccore, protein, taxonomy, gene, pubmed, ... |
| term | string | Entrez query string |
| retmax | number | Max IDs to return (default 20) |
| retstart | number | Offset for pagination |
| sort | string | Sort order |
| idtype | string | 'acc' for accession numbers |
| usehistory | 'y' \| 'n' | Store results on history server |
| datetype | string | Date field for filtering |
| mindate / maxdate | string | Date range (YYYY/MM/DD) |
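Since term is a plain Entrez query string, it can help to assemble it from parts rather than by string concatenation. The helper below is a hypothetical sketch, not part of the package; it only joins field-tagged clauses with AND and quotes multi-word values as phrases.

```typescript
// Hypothetical helper, not part of the package: builds an Entrez term string.
interface TermClause {
  value: string;
  field?: string; // Entrez field name or tag, e.g. 'Organism' or 'ORGN'
}

function buildTerm(clauses: TermClause[]): string {
  return clauses
    .map(({ value, field }) => {
      // Quote multi-word values so Entrez treats them as a single phrase.
      const text = /\s/.test(value) ? `"${value}"` : value;
      return field ? `${text}[${field}]` : text;
    })
    .join(' AND ');
}

// Reproduces the query used in the examples above.
const exampleTerm = buildTerm([
  { value: 'Mimosa', field: 'Organism' },
  { value: 'ITS' },
]);
// exampleTerm === 'Mimosa[Organism] AND ITS'
```

The resulting string can be passed as term to esearch or as the first argument of searchAccessions.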
EFetch — retrieve sequences and records
// Fetch FASTA by accession(s)
const fasta = await genbank.fetchFastaByAccessions(['MZ934621.1', 'MZ934622.1']);
// Fetch GenBank flat file
const gb = await genbank.fetchGenbankFlatfile('MZ934621.1');
// Raw efetch — full control over rettype/retmode
const xml = await genbank.efetchText({
db: 'nuccore',
id: ['MZ934621.1'],
rettype: 'gb',
retmode: 'xml',
});
// Stream FASTA in batches from a history query (avoids loading all IDs into memory)
const { webenv, queryKey } = await genbank.searchWithHistory('Mimosa[Organism] AND ITS');
const allFasta = await genbank.fetchFastaFromHistory(webenv, queryKey, 500);
EFetch params
| Param | Type | Description |
|---|---|---|
| db | EutilDb | Database |
| id | string \| string[] | Accession(s) or UID(s) |
| webenv | string | WebEnv token (from history search) |
| queryKey | string | query_key (from history search) |
| rettype | string | fasta, gb, gp, fasta_cds_aa, ... |
| retmode | string | text, xml, json |
| strand | 1 \| 2 | Strand (1 = plus, 2 = minus) |
| seqStart | number | Start position (1-based) |
| seqStop | number | Stop position |
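FASTA fetched for several accessions is typically one text blob containing multiple records. Assuming the fetch helpers return that text as a single string (the examples above suggest this, but it is not stated), a small parser like the hypothetical one below splits it into records.

```typescript
// Hypothetical helper, not part of the package: splits multi-record FASTA text.
interface FastaRecord {
  header: string;   // header line without the leading '>'
  sequence: string; // residues with line breaks removed
}

function parseFasta(text: string): FastaRecord[] {
  const records: FastaRecord[] = [];
  let current: FastaRecord | undefined;

  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === '') continue; // skip blank separator lines

    if (trimmed.startsWith('>')) {
      // A header line starts a new record.
      current = { header: trimmed.slice(1), sequence: '' };
      records.push(current);
    } else if (current) {
      // Sequence lines are appended to the most recent record.
      current.sequence += trimmed;
    }
  }
  return records;
}
```

Sequence lines that appear before the first header are ignored rather than treated as an error, which is a design choice of this sketch.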
ESummary — document summaries
// Fetch metadata for a list of IDs
const summary = await genbank.esummary({
db: 'nuccore',
id: ['MZ934621.1', 'MZ934622.1'],
version: '2.0',
});
// Paginate through a history-server query
const page = await genbank.esummaryFromHistory('nuccore', webenv, queryKey, 0, 500);
ESummary params
| Param | Type | Description |
|---|---|---|
| db | EutilDb | Database |
| id | string \| string[] | Accession(s) or UID(s) |
| retstart | number | Offset |
| retmax | number | Page size |
| version | string | '2.0' recommended |
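Paging through a history-server query comes down to stepping retstart by the page size until the total count is covered. The helper below is a hypothetical sketch of that arithmetic; it assumes the total is already a number (convert with Number() first if your count arrives as a string).

```typescript
// Hypothetical helper, not part of the package: lists the retstart offsets
// needed to cover `total` records in pages of `pageSize`.
function pageOffsets(total: number, pageSize: number): number[] {
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new RangeError('pageSize must be a positive integer');
  }
  const offsets: number[] = [];
  for (let retstart = 0; retstart < total; retstart += pageSize) {
    offsets.push(retstart);
  }
  return offsets;
}

// pageOffsets(1200, 500) returns [0, 500, 1000]
```

Each offset could then be passed to esummaryFromHistory together with the page size, assuming its fourth and fifth arguments are retstart and retmax as the example above suggests.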
ELink — cross-database links
// Raw elink — full control
const linkResult = await genbank.elink({
dbfrom: 'nuccore',
db: 'taxonomy',
id: ['MZ934621.1'],
linkname: 'nuccore_taxonomy',
});
// Convenience helpers
const taxIds = await genbank.linkToTaxonomy(['MZ934621.1', 'MZ934622.1']);
const geneIds = await genbank.linkToGene(['MZ934621.1']);
const pubmedIds = await genbank.linkToPubmed(['MZ934621.1']);
const pdbIds = await genbank.linkProteinToStructure(['NP_001361218.1']); // protein → PDB
// Or via sub-service directly
const taxIdsViaLink = await genbank.link.linkToTaxonomy(['MZ934621.1']); // renamed: taxIds is already declared above
ELink params
| Param | Type | Description |
|---|---|---|
| dbfrom | EutilDb | Source database |
| db | EutilDb | Target database |
| id | string \| string[] | Source IDs |
| cmd | ELinkCmd | neighbor (default), neighbor_score, acheck, llinks, ... |
| linkname | string | Specific link name, e.g. nuccore_taxonomy |
| webenv / queryKey | string | History server tokens |
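When linking a long list of IDs, splitting it into batches keeps each request small. Whether the client batches internally is not stated here, so the helper below is a hypothetical sketch; the default batch size of 200 is an assumption, loosely based on NCBI's guidance to switch to HTTP POST above roughly 200 UIDs.

```typescript
// Hypothetical helper, not part of the package: splits an ID list into batches.
function chunkIds(ids: string[], size = 200): string[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: string[][] = [];
  for (let i = 0; i < ids.length; i += size) {
    batches.push(ids.slice(i, i + size)); // last batch may be shorter
  }
  return batches;
}
```

Each batch can then be passed to a helper such as linkToTaxonomy in turn, and the per-batch results concatenated.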
EInfo — database metadata
// List all available NCBI databases
const dbs = await genbank.databases();
// ['pubmed', 'protein', 'nuccore', 'taxonomy', 'gene', ...]
// Full metadata for a database
const meta = await genbank.dbinfo('nuccore');
console.log(meta.count); // total records
console.log(meta.lastupdate); // last update timestamp
// Searchable fields — useful for building Entrez queries
const fields = await genbank.info.fields('nuccore');
// [{ name: 'ORGN', fullname: 'Organism', ... }, ...]
// Available cross-database links — useful to know valid ELink linknames
const links = await genbank.info.links('nuccore');
// [{ name: 'nuccore_taxonomy', dbto: 'taxonomy', ... }, ...]
EInfo result types
interface EInfoDbInfo {
dbname: string;
menuname: string;
description: string;
dbbuild: string;
count: string;
lastupdate: string;
fieldlist: EInfoField[];
linklist: EInfoLink[];
}
interface EInfoField {
name: string; // e.g. 'ORGN'
fullname: string; // e.g. 'Organism'
description: string;
termcount: string;
isdate: 'Y' | 'N';
isnumerical: 'Y' | 'N';
singletoken: 'Y' | 'N';
hierarchy: 'Y' | 'N';
ishidden: 'Y' | 'N';
istruncatable: 'Y' | 'N';
israngable: 'Y' | 'N';
}
interface EInfoLink {
name: string; // e.g. 'nuccore_taxonomy'
menu: string;
description: string;
dbto: string; // e.g. 'taxonomy'
}
BLAST
The BLAST client is accessible via genbank.blast and supports database-backed searches through the NCBI Common URL API.
Note: submitSearch(query, database) expects a raw sequence or FASTA string as query, not an accession. If you only have an accession, fetch the FASTA first with fetchFastaByAccessions().
Limitation: Only database-backed searches are supported. For true pairwise query-vs-subject comparisons (bl2seq), use local BLAST+ instead.
Database search
// 1. Fetch the FASTA sequence from an accession
const fasta = await genbank.fetchFastaByAccessions('MZ934621.1');
// 2. Submit BLAST search against a remote database
const { rid, estimatedSeconds } = await genbank.blast.submitSearch(fasta, 'nt', {
program: 'blastn',
megablast: true,
});
// 3. Wait the estimated time, then poll
await new Promise((r) => setTimeout(r, estimatedSeconds * 1000));
await genbank.blast.pollUntilReady(rid);
// 4. Retrieve results
const result = await genbank.blast.getResult(rid);
for (const search of result.results) {
for (const hit of search.hits) {
const top = hit.hsps[0];
console.log(hit.accession, top.identityPct, top.evalue);
}
}
BLAST result structure
interface BlastResult {
rid: string;
results: Array<{
queryTitle: string;
queryLen: number | null;
hits: Array<{
id: string;
title: string;
accession: string;
subjectLen: number;
hsps: Array<{
bitScore: number;
score: number;
evalue: number;
identity: number;
alignLen: number;
gaps: number;
identityPct: number | null; // (identity / alignLen) * 100
gapPct: number | null;
queryFrom: number;
queryTo: number;
hitFrom: number;
hitTo: number;
}>;
}>;
}>;
}
BLAST options
| Option | Type | Default | Description |
|---|---|---|---|
| program | BlastProgram | 'blastn' | blastn, blastp, blastx, tblastn, tblastx |
| megablast | boolean | true | Enable megablast for blastn |
| pollIntervalMs | number | 10000 | Polling interval in ms |
| maxWaitMs | number | 600000 | Max total wait time in ms |
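Once a result is retrieved, a common next step is to keep only the hits whose best HSP passes identity and e-value thresholds. The sketch below works on a subset of the result structure documented above; the helper names and the default cut-offs (97% identity, e-value 1e-10) are hypothetical and should be tuned to your data.

```typescript
// Subset of the documented result shape: only the fields used here.
interface Hsp {
  bitScore: number;
  evalue: number;
  identity: number;
  alignLen: number;
}
interface Hit {
  accession: string;
  hsps: Hsp[];
}

/** Percent identity as (identity / alignLen) * 100, or null when alignLen is 0. */
function identityPct(hsp: Hsp): number | null {
  return hsp.alignLen > 0 ? (hsp.identity / hsp.alignLen) * 100 : null;
}

/** Highest bit-score HSP of a hit, or undefined for a hit without HSPs. */
function bestHsp(hit: Hit): Hsp | undefined {
  return hit.hsps.reduce<Hsp | undefined>(
    (best, hsp) => (best === undefined || hsp.bitScore > best.bitScore ? hsp : best),
    undefined,
  );
}

/** Accessions whose best HSP meets both thresholds (hypothetical cut-offs). */
function filterHits(hits: Hit[], minIdentityPct = 97, maxEvalue = 1e-10): string[] {
  return hits
    .filter((hit) => {
      const top = bestHsp(hit);
      if (!top) return false;
      const pct = identityPct(top);
      return pct !== null && pct >= minIdentityPct && top.evalue <= maxEvalue;
    })
    .map((hit) => hit.accession);
}
```

Selecting the best HSP by bit score, rather than taking hsps[0], avoids relying on the order in which HSPs are returned.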
Using sub-services independently
Each service can be imported and instantiated on its own:
import { HttpClient, EsearchService, EfetchService, EinfoService } from '@lsbjordao/genbank-js';
const client = new HttpClient({ tool: 'my-app', email: 'you@example.com' }); // placeholder address
const search = new EsearchService(client);
const fetch = new EfetchService(client);
const info = new EinfoService(client);
Rate limiting
NCBI enforces:
- 3 requests/second without an API key
- 10 requests/second with an API key
The client throttles automatically. Get a free API key at https://www.ncbi.nlm.nih.gov/account/.
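To make the limits concrete, here is a minimal sketch of spacing-based throttling: each request reserves the next free time slot, and slots are 1000 / requestsPerSecond ms apart. The class and function names are hypothetical, and the package's own throttle may work differently.

```typescript
// Hypothetical sketch of spacing-based throttling, not the package's code.
class RequestSpacer {
  private nextSlotMs = 0;
  private readonly intervalMs: number;

  constructor(requestsPerSecond: number) {
    if (!(requestsPerSecond > 0)) {
      throw new RangeError('requestsPerSecond must be positive');
    }
    this.intervalMs = 1000 / requestsPerSecond;
  }

  /**
   * Reserves the next free slot for a request issued at `nowMs` and
   * returns how many milliseconds the caller should wait before sending.
   */
  reserve(nowMs: number): number {
    const startMs = Math.max(nowMs, this.nextSlotMs);
    this.nextSlotMs = startMs + this.intervalMs;
    return startMs - nowMs;
  }
}

/** Default rate from the limits listed above: 10 req/s with a key, 3 without. */
function defaultRequestsPerSecond(apiKey?: string): number {
  return apiKey ? 10 : 3;
}
```

A caller would typically wait for the returned delay, for example with setTimeout and Date.now(), before issuing each request. Passing the clock in as an argument keeps the sketch easy to test.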
License
MIT
