@ncbijs/datasets
v0.0.0
Typed client for the NCBI Datasets API v2 (genes, genomes, taxonomy)
Typed client for the NCBI Datasets API v2. Access genes, genomes, taxonomy, viruses, BioProjects, and BioSamples with zero XML parsing.
Installation
npm install @ncbijs/datasets
Usage
import { Datasets } from '@ncbijs/datasets';
const datasets = new Datasets({ apiKey: process.env.NCBI_API_KEY });
const genes = await datasets.geneById([672, 7157]);
console.log(genes[0].symbol); // 'BRCA1'
console.log(genes[0].description); // 'BRCA1 DNA repair associated'
const taxonomy = await datasets.taxonomy([9606]);
console.log(taxonomy[0].organismName); // 'Homo sapiens'
console.log(taxonomy[0].rank); // 'species'
const genomes = await datasets.genomeByAccession(['GCF_000001405.40']);
console.log(genomes[0].assemblyInfo.assemblyName); // 'GRCh38.p14'
API
new Datasets(config?)
| Option | Default | Description |
| ------------ | ------- | --------------------------------------------------- |
| apiKey | -- | NCBI API key (raises rate limit from 5 to 10 req/s) |
| maxRetries | 3 | Number of retries on 429/5xx errors |
Gene
geneById(geneIds: Array<number>): Promise<Array<GeneReport>>
Fetch gene metadata by NCBI Gene IDs.
geneBySymbol(symbols: Array<string>, taxon: number | string): Promise<Array<GeneReport>>
Fetch gene metadata by gene symbol and taxon (ID or name).
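For long ID lists it may help to split the input into smaller batches before calling geneById. The sketch below is a minimal illustration under stated assumptions: chunk and geneByIdBatched are hypothetical helpers, not part of @ncbijs/datasets, and the batch size of 100 is an assumed value, since this README does not state a per-request limit.

```typescript
// Hypothetical helper (not part of @ncbijs/datasets): split a list into
// fixed-size batches so each geneById call stays small.
function chunk<T>(items: ReadonlyArray<T>, size: number): Array<Array<T>> {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: Array<Array<T>> = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Minimal structural type for the one method used here, so the sketch
// compiles without the package installed.
interface GeneClient<R> {
  geneById(geneIds: Array<number>): Promise<Array<R>>;
}

// Fetch the batches one after another and concatenate the reports.
async function geneByIdBatched<R>(
  client: GeneClient<R>,
  geneIds: ReadonlyArray<number>,
  batchSize: number = 100, // assumed value; tune for your workload
): Promise<Array<R>> {
  const reports: Array<R> = [];
  for (const batch of chunk(geneIds, batchSize)) {
    const batchReports = await client.geneById(batch);
    reports.push(...batchReports);
  }
  return reports;
}
```

Running the batches sequentially keeps the request rate low; a Datasets instance should satisfy the GeneClient shape because it exposes geneById with the signature documented above.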
Taxonomy
taxonomy(taxons: Array<number | string>): Promise<Array<TaxonomyReport>>
Fetch taxonomy data by taxon IDs or names.
Genome
genomeByAccession(accessions: Array<string>): Promise<Array<GenomeReport>>
Fetch genome assembly reports by accession (e.g., GCF_000001405.40).
genomeByTaxon(taxon: number | string): Promise<Array<GenomeReport>>
Fetch genome assembly reports for all assemblies of a taxon.
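Checking accession strings before calling genomeByAccession can catch typos early. The following is a sketch, not something the package provides: isAssemblyAccession and partitionAccessions are hypothetical names, and the pattern assumes versioned accessions of the form shown above (a GCA_ or GCF_ prefix, nine digits, a dot, and a version number). Whether the client also accepts other forms is not stated in this README.

```typescript
// Versioned assembly accession, e.g. GCF_000001405.40:
// GCA_ (GenBank) or GCF_ (RefSeq), nine digits, a dot, then the version.
const ASSEMBLY_ACCESSION = /^GC[AF]_\d{9}\.\d+$/;

// Hypothetical helper: true when the trimmed value matches the pattern.
function isAssemblyAccession(value: string): boolean {
  return ASSEMBLY_ACCESSION.test(value.trim());
}

// Hypothetical helper: separate well-formed accessions from the rest so
// that only the valid ones are sent to the API.
function partitionAccessions(
  values: ReadonlyArray<string>,
): { valid: Array<string>; invalid: Array<string> } {
  const valid: Array<string> = [];
  const invalid: Array<string> = [];
  for (const value of values) {
    if (isAssemblyAccession(value)) {
      valid.push(value.trim());
    } else {
      invalid.push(value);
    }
  }
  return { valid, invalid };
}
```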
Virus
virusByAccession(accessions: Array<string>): Promise<Array<VirusReport>>
Fetch virus genome reports by accessions.
virusByTaxon(taxon: number | string): Promise<Array<VirusReport>>
Fetch virus genome reports for all viruses of a taxon.
BioProject
bioproject(accessions: Array<string>): Promise<Array<BioProjectReport>>
Fetch BioProject reports by accessions (e.g., PRJNA12345).
BioSample
biosample(accessions: Array<string>): Promise<Array<BioSampleReport>>
Fetch BioSample reports by accessions (e.g., SAMN12345).
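The attributes field of a BioSampleReport is a list of name/value pairs, which is awkward to query by name. A small lookup table, sketched below, is one way to handle that. attributesByName is a hypothetical helper, the interface is copied locally so the block compiles on its own, and the attribute names in the usage comment are illustrative rather than taken from a real response.

```typescript
// Local copy of the documented shape so this sketch is self-contained.
interface BioSampleAttribute {
  name: string;
  value: string;
}

// Hypothetical helper: index attributes by name. Values are kept in a
// list because the same name may appear more than once.
function attributesByName(
  attributes: ReadonlyArray<BioSampleAttribute>,
): Map<string, Array<string>> {
  const byName = new Map<string, Array<string>>();
  for (const { name, value } of attributes) {
    const existing = byName.get(name);
    if (existing) {
      existing.push(value);
    } else {
      byName.set(name, [value]);
    }
  }
  return byName;
}

// Usage (illustrative attribute names):
//   const lookup = attributesByName(sample.attributes);
//   const hosts = lookup.get('host') ?? [];
```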
Assembly
assemblyDescriptors(accessions: Array<string>): Promise<Array<AssemblyDescriptor>>
Fetch lightweight assembly descriptors by accession numbers.
Gene links
geneLinks(geneIds: Array<number>): Promise<Array<GeneLink>>
Fetch external database links for genes by NCBI Gene IDs.
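Each GeneLink nests its links under a gene ID. For tabular output it can be convenient to flatten the result into one row per link. The sketch below assumes the GeneLink and ExternalLink shapes documented in this README (redeclared locally); flattenGeneLinks and GeneLinkRow are hypothetical names, not part of the package.

```typescript
// Local copies of the documented shapes so this sketch is self-contained.
interface ExternalLink {
  resourceName: string;
  url: string;
}

interface GeneLink {
  geneId: number;
  links: Array<ExternalLink>;
}

// Hypothetical row type: one entry per (gene, external resource) pair.
interface GeneLinkRow {
  geneId: number;
  resourceName: string;
  url: string;
}

// Hypothetical helper: flatten nested links into rows, preserving order.
function flattenGeneLinks(geneLinks: ReadonlyArray<GeneLink>): Array<GeneLinkRow> {
  const rows: Array<GeneLinkRow> = [];
  for (const { geneId, links } of geneLinks) {
    for (const { resourceName, url } of links) {
      rows.push({ geneId, resourceName, url });
    }
  }
  return rows;
}
```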
Catalog
datasetCatalog(): Promise<Array<DatasetInfo>>
List available NCBI datasets from the catalog.
Error handling
import { Datasets, DatasetsHttpError } from '@ncbijs/datasets';
try {
  await datasets.geneById([672]);
} catch (err) {
  if (err instanceof DatasetsHttpError) {
    console.error(`HTTP ${err.status}: ${err.body}`);
  }
}
The client automatically retries on HTTP 429, 500, 502, 503 and network errors with exponential backoff + jitter.
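To illustrate what exponential backoff with jitter means in practice, here is a minimal sketch. It is not the client's actual implementation: the base delay, the cap, and the "full jitter" strategy (a uniformly random delay below the exponential ceiling) are all assumptions, and isRetryable and backoffDelayMs are hypothetical names. Only the list of retried status codes comes from this README.

```typescript
// Status codes this README says are retried.
const RETRYABLE_STATUSES = new Set<number>([429, 500, 502, 503]);

function isRetryable(status: number): boolean {
  return RETRYABLE_STATUSES.has(status);
}

// Full-jitter backoff (assumed strategy): the ceiling doubles with each
// attempt up to capMs, and the delay is drawn uniformly from
// [0, ceiling). `random` is injectable so the function can be tested.
function backoffDelayMs(
  attempt: number, // 0 for the first retry
  baseMs: number = 500, // assumed base delay
  capMs: number = 8000, // assumed upper bound
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

The jitter spreads retries from many clients over time, so they do not all hit the server again at the same instant after a 429.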
Response types
GeneReport
interface GeneReport {
  geneId: number;
  symbol: string;
  description: string;
  taxId: number;
  taxName: string;
  commonName: string;
  type: string;
  chromosomes: Array<string>;
  synonyms: Array<string>;
  swissProtAccessions: Array<string>;
  ensemblGeneIds: Array<string>;
  omimIds: Array<string>;
  summary: string;
  transcriptCount: number;
  proteinCount: number;
  geneOntology: GeneOntology;
}
TaxonomyReport
interface TaxonomyReport {
  taxId: number;
  organismName: string;
  commonName: string;
  rank: string;
  lineage: Array<number>;
  children: Array<number>;
  counts: Array<TaxonomyCount>;
}
GenomeReport
interface GenomeReport {
  accession: string;
  currentAccession: string;
  sourceDatabase: string;
  organism: GenomeOrganism;
  assemblyInfo: AssemblyInfo;
  assemblyStats: AssemblyStats;
}
VirusReport
interface VirusReport {
  accession: string;
  taxId: number;
  organismName: string;
  isolateName: string;
  host: string;
  collectionDate: string;
  geoLocation: string;
  completeness: string;
  length: number;
  bioprojectAccession: string;
  biosampleAccession: string;
}
BioProjectReport
interface BioProjectReport {
  accession: string;
  title: string;
  description: string;
  organismName: string;
  taxId: number;
  projectType: string;
  registrationDate: string;
}
BioSampleReport
interface BioSampleReport {
  accession: string;
  title: string;
  description: string;
  organismName: string;
  taxId: number;
  ownerName: string;
  submissionDate: string;
  publicationDate: string;
  attributes: Array<BioSampleAttribute>;
}
BioSampleAttribute
interface BioSampleAttribute {
  name: string;
  value: string;
}
AssemblyDescriptor
interface AssemblyDescriptor {
  accession: string;
  assemblyName: string;
  assemblyLevel: string;
  organism: string;
  taxId: number;
  submitter: string;
  releaseDate: string;
}
GeneLink
interface GeneLink {
  geneId: number;
  links: Array<ExternalLink>;
}
ExternalLink
interface ExternalLink {
  resourceName: string;
  url: string;
}
DatasetInfo
interface DatasetInfo {
  name: string;
  description: string;
  version: string;
}