@oxdev03/node-tantivy-binding

v0.2.0

Node.js bindings for Tantivy. Provides indexing, querying, and advanced search features with TypeScript support.

Downloads

816

node-tantivy-binding

License: MIT

Node.js bindings for Tantivy, the full-text search engine library written in Rust.

This project is a Node.js port of tantivy-py, providing JavaScript/TypeScript bindings for the Tantivy search engine. The implementation closely follows the Python API to maintain consistency across language bindings.

⚠️ Note: This is a first-draft implementation ported from tantivy-py (see submodule hash). There may be unidentified bugs. The test suite is also based on tantivy-py. Furthermore, not all future API changes will be reflected in this binding.

Installation

The bindings can be installed using npm:

npm install @oxdev03-org/node-tantivy-binding

If no binary is present for your operating system, the bindings will be built from source, which requires Rust to be installed.

Quick Start

For more detailed examples, see the tutorials.

import { SchemaBuilder, FieldType, Index, Document } from '@oxdev03-org/node-tantivy-binding'

// Create a schema
const schema = new SchemaBuilder()
  .addTextField('title', { stored: true })
  .addTextField('body', { stored: true })
  .build()

// Create an index
const index = new Index(schema)
const writer = index.writer()

// Add documents
const doc1 = new Document()
doc1.addText('title', 'The Old Man and the Sea')
doc1.addText('body', 'He was an old man who fished alone in a skiff in the Gulf Stream.')
writer.addDocument(doc1)
writer.commit()

// Search
const searcher = index.searcher()
const query = index.parseQuery('sea', ['title', 'body'])
const results = searcher.search(query, 10)

console.log('Found', results.hits.length, 'results')

Features

This Node.js binding provides access to most of Tantivy's functionality:

  • Full-text search with BM25 scoring
  • Structured queries with boolean operations
  • Faceted search for filtering and aggregation
  • Snippet generation for search result highlighting
  • Query explanation for debugging relevance scoring
  • Multiple field types: text, integers, floats, dates, facets
  • Flexible tokenization and text analysis
  • JSON document support

API Compatibility

The API closely follows tantivy-py to maintain consistency:

  • Same class names and method signatures where possible
  • Compatible document and query structures
  • Equivalent search result formats
  • Similar configuration options

Development

Requirements

  • Install the latest Rust (required for building from source)
  • Install Node.js 22 or later, which fully supports Node-API
  • Install yarn

Building from Source

# Clone the repository
git clone <repository-url>
cd node-tantivy-binding

# Install dependencies
npm install

# Build the native module
npm run build

# Run tests
npm test

Testing

The project includes a comprehensive test suite migrated from tantivy-py:

npm test

Project Status

This is a first draft port of tantivy-py to Node.js. While the core functionality works, please be aware:

  • ⚠️ Potential bugs: Some edge cases may not be handled correctly
  • 🔄 API changes: The API may evolve in future versions

Known Implementation Differences & TODOs

The Node.js implementation currently differs from the Python version in several ways. These are documented TODOs for future improvement:

🔴 Critical Validation Issues

Numeric Field Validation (Too Lenient)

Current behavior: Node.js version accepts invalid values that Python rejects
TODO: Implement strict validation to match Python behavior

// ❌ These currently PASS in Node.js but should FAIL:
Document.fromDict({ unsigned: -50 }, schema) // Should reject negative for unsigned
Document.fromDict({ signed: 50.4 }, schema) // Should reject float for integer
Document.fromDict({ unsigned: [1000, -50] }, schema) // Should reject arrays for single fields
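Until strict validation lands, callers may want to check integer values themselves before building a document. The following is a minimal sketch of such a check. `checkIntegerField` and `IntegerKind` are hypothetical names introduced here for illustration and are not part of this binding's API; the rules simply mirror the three cases listed above.

```typescript
// Hypothetical pre-validation helper; NOT part of node-tantivy-binding.
type IntegerKind = 'signed' | 'unsigned'

// Returns an error message for an invalid value, or null when the value is acceptable.
function checkIntegerField(value: unknown, kind: IntegerKind): string | null {
  if (Array.isArray(value)) {
    return 'expected a single value, got an array'
  }
  if (typeof value !== 'number' || !Number.isInteger(value)) {
    return 'expected an integer'
  }
  if (!Number.isSafeInteger(value)) {
    return 'integer is outside the safe range for a JavaScript number'
  }
  if (kind === 'unsigned' && value < 0) {
    return 'expected a non-negative integer'
  }
  return null
}

console.log(checkIntegerField(-50, 'unsigned'))
console.log(checkIntegerField(50.4, 'signed'))
console.log(checkIntegerField(1000, 'unsigned'))
```

Returning a message instead of throwing lets the caller collect every problem in a document before deciding whether to reject it.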

Bytes Field Validation (Too Restrictive)

Current behavior: Only accepts Buffer objects
TODO: Support byte arrays like Python version

// ❌ These currently FAIL in Node.js but should PASS:
Document.fromDict({ bytes: [1, 2, 3] }, schema) // Should accept byte arrays
Document.fromDict(
  {
    bytes: [
      [1, 2, 3],
      [4, 5, 6],
    ],
  },
  schema,
) // Should accept nested arrays
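As a possible workaround while only Buffer objects are accepted, plain byte arrays can be converted before they reach the document. The sketch below is an assumption about how that conversion might look: `toBytes` and `toBytesList` are hypothetical helpers, not functions exported by this binding. They return `Uint8Array` values so the example has no runtime dependencies.

```typescript
// Hypothetical conversion helpers; NOT part of node-tantivy-binding.

// Converts one byte array (or an existing Uint8Array) into a Uint8Array,
// rejecting anything that is not an integer in the range 0..255.
function toBytes(value: unknown): Uint8Array {
  if (value instanceof Uint8Array) {
    return value
  }
  if (!Array.isArray(value)) {
    throw new TypeError('expected an array of bytes')
  }
  for (const b of value) {
    if (typeof b !== 'number' || !Number.isInteger(b) || b < 0 || b > 255) {
      throw new RangeError('byte values must be integers between 0 and 255')
    }
  }
  return Uint8Array.from(value as number[])
}

// Accepts either a single byte array or a nested array of byte arrays
// and always returns a list of Uint8Array values.
function toBytesList(value: unknown): Uint8Array[] {
  if (Array.isArray(value) && value.length > 0 && value.every((v) => Array.isArray(v))) {
    return value.map((v) => toBytes(v))
  }
  return [toBytes(value)]
}

console.log(toBytes([1, 2, 3]))
console.log(toBytesList([[1, 2, 3], [4, 5, 6]]).length)
```

In Node.js, each result can then be wrapped with `Buffer.from(bytes)` to obtain the Buffer the binding currently expects.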

JSON Field Validation (Too Lenient)

Current behavior: Accepts primitive types for JSON fields
TODO: Restrict to objects/arrays only

// ❌ These currently PASS in Node.js but should FAIL:
Document.fromDict({ json: 123 }, schema) // Should reject numbers
Document.fromDict({ json: 'hello' }, schema) // Should reject strings
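The objects/arrays-only rule can be approximated on the caller's side with a small type guard. This is a sketch under assumptions: `isJsonContainer` is a hypothetical name, and treating `null` and class instances such as `Date` as invalid is a choice made here, not documented behavior of either binding.

```typescript
// Hypothetical type guard; NOT part of node-tantivy-binding.
// Accepts arrays and plain objects, rejects primitives, null, and class instances.
function isJsonContainer(value: unknown): value is Record<string, unknown> | unknown[] {
  if (Array.isArray(value)) {
    return true
  }
  if (typeof value !== 'object' || value === null) {
    return false
  }
  const proto = Object.getPrototypeOf(value)
  return proto === Object.prototype || proto === null
}

console.log(isJsonContainer(123))
console.log(isJsonContainer('hello'))
console.log(isJsonContainer({ title: 'The Old Man and the Sea' }))
```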

🟠 Error Handling Differences

Fast Field Configuration

Current: Throws exception when field not configured as fast
Python: Returns empty results
TODO: Decide on consistent error handling approach
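Code that needs the Python-style behavior today could wrap the throwing call and map the expected error to an empty result. The sketch below is generic on purpose, because this README does not specify which call throws or what the error looks like. `orEmpty` and the `isExpected` predicate are hypothetical, and the error message in the usage lines is made up for illustration.

```typescript
// Hypothetical wrapper; NOT part of node-tantivy-binding.
// Runs `run` and returns its result. If it throws an error that `isExpected`
// recognizes, an empty array is returned instead; any other error is rethrown.
function orEmpty<T>(run: () => T[], isExpected: (err: unknown) => boolean): T[] {
  try {
    return run()
  } catch (err) {
    if (isExpected(err)) {
      return []
    }
    throw err
  }
}

// Illustrative predicate: the message text is an assumption, not the binding's real error.
const isNotFastError = (err: unknown): boolean =>
  err instanceof Error && err.message.includes('not fast')

const values = orEmpty<number>(() => {
  throw new Error('field is not fast')
}, isNotFastError)
console.log(values.length)
```

Passing the predicate explicitly keeps unrelated failures visible instead of silently turning every error into an empty result.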

Query Parser Errors

Current: Different error message formats
TODO: Align error messages with Python version

🔵 Type System Differences

Date Handling

Current: Uses getTime() timestamps
Python: Uses datetime objects
TODO: Consider more intuitive date API
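Since `Date.prototype.getTime()` returns milliseconds since the Unix epoch, converting in both directions takes only a few lines. The helpers below are a sketch: `toTimestamp` and `fromTimestamp` are hypothetical names, and the assumption that the binding works with the raw `getTime()` value is taken from the note above rather than verified against the implementation.

```typescript
// Hypothetical date helpers; NOT part of node-tantivy-binding.

// Converts a Date to a getTime()-style timestamp (milliseconds since the Unix epoch).
// Invalid dates are rejected instead of silently producing NaN.
function toTimestamp(date: Date): number {
  const ms = date.getTime()
  if (Number.isNaN(ms)) {
    throw new RangeError('invalid Date')
  }
  return ms
}

// Converts a millisecond timestamp back into a Date.
function fromTimestamp(ms: number): Date {
  return new Date(ms)
}

const published = new Date('2020-01-01T00:00:00Z')
console.log(toTimestamp(published)) // 1577836800000
console.log(fromTimestamp(0).toISOString()) // 1970-01-01T00:00:00.000Z
```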

Architecture

Built with:

  • napi-rs: For Node.js ↔ Rust bindings
  • Tantivy: The underlying search engine
  • TypeScript: Full type definitions included
  • Vitest: For testing

Acknowledgments

This project is heavily inspired by and based on:

  • tantivy-py: The Python bindings this port follows
  • Tantivy: The underlying search engine

License

MIT License - see LICENSE file for details.