@ncbijs/pubtator

v0.0.0

Text mining client for PubTator/BioC APIs with entity annotations and relation discovery

Why

PubTator3 has over 1 billion entity annotations across 36 million PubMed articles and 6 million PMC full-text articles. It identifies genes, diseases, chemicals, mutations, species, and cell lines. But the API has multiple layers (entity autocomplete, publication export) with different response formats.

@ncbijs/pubtator wraps them into a typed, promise-based client.

  • Entity search — autocomplete entities by name with optional type filter
  • Publication search — search PubTator-indexed publications
  • BioC export — export annotations in BioC XML or JSON
  • Free-text annotation — annotate arbitrary text with entity recognition
  • TSV parsing — parse PubTator tab-separated annotation format

Install

npm install @ncbijs/pubtator

Quick start

import { PubTator } from '@ncbijs/pubtator';

const pubtator = new PubTator();

// Search for gene entities
const genes = await pubtator.findEntity('BRCA1', 'gene');
console.log(genes[0].name); // "BRCA1"

// Export BioC annotations for PubMed articles
const bioc = await pubtator.export(['33024307', '32919527']);
for (const doc of bioc.documents) {
  for (const passage of doc.passages) {
    console.log(passage.annotations);
  }
}

API

new PubTator()

Creates a new PubTator3 client. No configuration required.

findEntity(query, entityType?)

Search entities by name via the PubTator3 autocomplete API.

const results = await pubtator.findEntity('aspirin', 'chemical');

| Parameter  | Type       | Required | Description                        |
| ---------- | ---------- | -------- | ---------------------------------- |
| query      | string     | Yes      | Entity name or partial name.       |
| entityType | EntityType | No       | Filter by entity type (see below). |

Returns Promise<ReadonlyArray<EntityMatch>>.

EntityType values

| Constant              | API value   |
| --------------------- | ----------- |
| ENTITY_TYPES.Gene     | 'gene'      |
| ENTITY_TYPES.Disease  | 'disease'   |
| ENTITY_TYPES.Chemical | 'chemical'  |
| ENTITY_TYPES.Variant  | 'variant'   |
| ENTITY_TYPES.Species  | 'species'   |
| ENTITY_TYPES.CellLine | 'cell_line' |
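
When the entity type comes from user input (a CLI flag or a query string, say), it can help to validate it before calling findEntity. The following is a minimal sketch, assuming only the six API values listed in the table above. The names ENTITY_TYPE_VALUES, isEntityTypeValue and toEntityTypeValue are hypothetical and not part of the package; the list is copied locally so the snippet stays self-contained.

```typescript
// Local copy of the six API values from the EntityType table.
// Hypothetical name; the package's own ENTITY_TYPES constant is not used here.
const ENTITY_TYPE_VALUES = [
  'gene',
  'disease',
  'chemical',
  'variant',
  'species',
  'cell_line',
] as const;

type EntityTypeValue = (typeof ENTITY_TYPE_VALUES)[number];

// Type guard: narrows an arbitrary string to one of the documented values.
function isEntityTypeValue(value: string): value is EntityTypeValue {
  return (ENTITY_TYPE_VALUES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: trims, lower-cases and joins words with underscores,
// then returns the matching value or undefined when nothing matches.
function toEntityTypeValue(input: string): EntityTypeValue | undefined {
  const normalised = input.trim().toLowerCase().replace(/[\s-]+/g, '_');
  return isEntityTypeValue(normalised) ? normalised : undefined;
}
```

A value that passes the guard should be usable as the second argument to findEntity, provided EntityType is the union of these string literals as the table suggests.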

search(query, options?)

Search PubTator-indexed publications by text.

const results = await pubtator.search('BRCA1 breast cancer', { page: 1, pageSize: 10 });
console.log(results.total);

| Parameter | Type          | Required | Description  |
| --------- | ------------- | -------- | ------------ |
| query     | string        | Yes      | Search text. |
| options   | SearchOptions | No       | Pagination.  |

SearchOptions

| Option   | Type   | Default | Description       |
| -------- | ------ | ------- | ----------------- |
| page     | number | --      | Page number.      |
| pageSize | number | --      | Results per page. |

Returns Promise<SearchResult>.
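
To walk through every page of a search, you need to know how many pages to request. Below is a rough sketch of that arithmetic. It assumes that results.total is the overall number of matches and that pages are 1-based, as in the example above; neither is confirmed here. pageCount and pageNumbers are hypothetical helpers, not package exports.

```typescript
// Number of pages needed to cover `total` results at `pageSize` per page.
function pageCount(total: number, pageSize: number): number {
  if (pageSize <= 0 || Math.floor(pageSize) !== pageSize) {
    throw new RangeError('pageSize must be a positive integer');
  }
  return total <= 0 ? 0 : Math.ceil(total / pageSize);
}

// Page numbers to request, assuming 1-based pages.
// `maxPages` is an optional cap so a broad query does not trigger
// an unbounded number of requests.
function pageNumbers(total: number, pageSize: number, maxPages = Infinity): number[] {
  const count = Math.min(pageCount(total, pageSize), maxPages);
  const pages: number[] = [];
  for (let page = 1; page <= count; page += 1) {
    pages.push(page);
  }
  return pages;
}
```

Each number returned by pageNumbers can then be passed as page to search, with the same pageSize on every call.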

export(pmids, options?)

Export BioC annotations for a list of PMIDs.

const bioc = await pubtator.export(['33024307'], { format: 'xml', full: true });

| Parameter | Type                  | Required | Description                         |
| --------- | --------------------- | -------- | ----------------------------------- |
| pmids     | ReadonlyArray<string> | Yes      | PubMed IDs to export.               |
| options   | ExportOptions         | No       | Format and full-text configuration. |

ExportOptions

| Option | Type              | Default | Description                                   |
| ------ | ----------------- | ------- | --------------------------------------------- |
| format | 'json' \| 'xml'   | 'json'  | BioC output format.                           |
| full   | boolean           | --      | Include full-text annotations when available. |

Returns Promise<BioDocument>.
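
For a long list of PMIDs, you may prefer to export in several smaller requests. This README does not state a per-request limit, so the batch size in this sketch is purely an assumption to tune yourself. chunkPmids is a hypothetical helper that removes duplicates and splits the list into batches.

```typescript
// Split a PMID list into batches of at most `batchSize` entries.
// Duplicates are dropped, keeping the first occurrence of each PMID.
function chunkPmids(pmids: ReadonlyArray<string>, batchSize: number): string[][] {
  if (batchSize <= 0 || Math.floor(batchSize) !== batchSize) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const unique = pmids.filter((pmid, index) => pmids.indexOf(pmid) === index);
  const batches: string[][] = [];
  for (let start = 0; start < unique.length; start += batchSize) {
    batches.push(unique.slice(start, start + batchSize));
  }
  return batches;
}
```

Each batch can then be handed to export in turn, for example inside a for...of loop that awaits one call before starting the next.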

annotateByPmid(pmids, options?)

Annotate articles by their PubMed IDs.

const annotations = await pubtator.annotateByPmid(['33024307'], {
  concept: 'Gene',
  format: 'PubTator',
});

| Parameter | Type                  | Required | Description                |
| --------- | --------------------- | -------- | -------------------------- |
| pmids     | ReadonlyArray<string> | Yes      | PubMed IDs to annotate.    |
| options   | AnnotateOptions       | No       | Concept filter and format. |

AnnotateOptions

| Option  | Type                             | Default | Description                                    |
| ------- | -------------------------------- | ------- | ---------------------------------------------- |
| concept | ConceptType                      | --      | Filter to a specific concept type (see below). |
| format  | 'PubTator' \| 'BioC' \| 'JSON'   | --      | Output format.                                 |

Returns Promise<string>.

ConceptType values

| Constant                 | API value    |
| ------------------------ | ------------ |
| CONCEPT_TYPES.Gene       | 'Gene'       |
| CONCEPT_TYPES.Disease    | 'Disease'    |
| CONCEPT_TYPES.Chemical   | 'Chemical'   |
| CONCEPT_TYPES.Mutation   | 'Mutation'   |
| CONCEPT_TYPES.Species    | 'Species'    |
| CONCEPT_TYPES.BioConcept | 'BioConcept' |

annotateText(text, options?)

Annotate free text with entity recognition.

const annotated = await pubtator.annotateText(
  'BRCA1 is associated with breast cancer susceptibility.',
  { concept: 'Disease' },
);

| Parameter | Type            | Required | Description                |
| --------- | --------------- | -------- | -------------------------- |
| text      | string          | Yes      | Free text to annotate.     |
| options   | AnnotateOptions | No       | Concept filter and format. |

Returns Promise<string>.

parseBioC(input)

Parse a BioC XML or JSON string into a typed BioDocument.

import { parseBioC } from '@ncbijs/pubtator';

const bioc = parseBioC(xmlString);

| Parameter | Type   | Required | Description                       |
| --------- | ------ | -------- | --------------------------------- |
| input     | string | Yes      | BioC XML or JSON string to parse. |

Returns BioDocument.
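
Once you have a BioDocument, a common next step is to gather every annotation into one flat list. The sketch below relies only on the documents, passages and annotations fields shown in the quick start. The BioDocumentLike, DocumentLike and PassageLike interfaces are hypothetical structural stand-ins, not the package's exported types, and the annotation shape is left generic because its fields are not documented here.

```typescript
// Hypothetical structural stand-ins for the package's BioC types.
// Only the three field names used in the quick start are assumed.
interface PassageLike<A> {
  readonly annotations: ReadonlyArray<A>;
}

interface DocumentLike<A> {
  readonly passages: ReadonlyArray<PassageLike<A>>;
}

interface BioDocumentLike<A> {
  readonly documents: ReadonlyArray<DocumentLike<A>>;
}

// Flatten documents -> passages -> annotations into a single array,
// preserving document and passage order.
function collectAnnotations<A>(bioc: BioDocumentLike<A>): A[] {
  const all: A[] = [];
  for (const doc of bioc.documents) {
    for (const passage of doc.passages) {
      for (const annotation of passage.annotations) {
        all.push(annotation);
      }
    }
  }
  return all;
}
```

If the real BioDocument has this shape, the result of parseBioC or export could be passed straight to collectAnnotations.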

parsePubTatorTsv(input)

Parse PubTator tab-separated annotation format.

import { parsePubTatorTsv } from '@ncbijs/pubtator';

const annotations = parsePubTatorTsv(tsvString);

| Parameter | Type   | Required | Description                   |
| --------- | ------ | -------- | ----------------------------- |
| input     | string | Yes      | PubTator TSV string to parse. |

Returns ReadonlyArray<PubTatorAnnotation>.
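
For orientation, the PubTator text format usually puts the title and abstract on pipe-delimited lines (PMID|t|... and PMID|a|...), followed by tab-separated annotation lines holding the PMID, start offset, end offset, mention text, concept type and identifier. The sketch below only illustrates that layout. It is not the package's parser, and SketchAnnotation with its field names is hypothetical; the actual fields of PubTatorAnnotation may differ.

```typescript
// Hypothetical record shape, used for illustration only.
interface SketchAnnotation {
  pmid: string;
  start: number;
  end: number;
  text: string;
  type: string;
  id: string;
}

// Extract tab-separated annotation lines, skipping title/abstract lines,
// blank lines and anything whose offsets are not plain integers.
function parseAnnotationLines(input: string): SketchAnnotation[] {
  const result: SketchAnnotation[] = [];
  for (const line of input.split(/\r?\n/)) {
    const fields = line.split('\t');
    if (fields.length < 5) {
      continue; // "PMID|t|..." and "PMID|a|..." lines contain no tab-separated fields
    }
    const [pmid = '', startRaw = '', endRaw = '', text = '', type = '', id = ''] = fields;
    if (!/^\d+$/.test(startRaw) || !/^\d+$/.test(endRaw)) {
      continue;
    }
    result.push({ pmid, start: Number(startRaw), end: Number(endRaw), text, type, id });
  }
  return result;
}
```

In practice parsePubTatorTsv should be preferred; the sketch is only meant to show what kind of input it expects.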

Types

All types are exported for use in your own interfaces:

import type {
  AnnotateOptions,
  Annotation,
  BioDocument,
  BioPassage,
  ConceptType,
  EntityMatch,
  EntityType,
  ExportOptions,
  PubTatorAnnotation,
  SearchOptions,
  SearchResult,
} from '@ncbijs/pubtator';