@delali/narsil-embeddings-transformers

A Transformers.js embedding adapter for the Narsil search engine. This adapter runs embedding models directly in Node.js or the browser using ONNX Runtime, so there are no external API calls and no data leaves your environment. It conforms to Narsil's EmbeddingAdapter interface and supports any Hugging Face model that works with the feature-extraction pipeline.

Installation

pnpm add @delali/narsil-embeddings-transformers @huggingface/transformers

@huggingface/transformers is a peer dependency. You must install it alongside this package. Any version >=3.0.0 is supported.

Quick start

import { createTransformersEmbedding } from '@delali/narsil-embeddings-transformers'

const embedding = createTransformersEmbedding({
  dimensions: 384,
})

const vector = await embedding.embed('a red panda eating bamboo', 'document')
console.log(vector.length) // 384

The factory function returns the adapter object synchronously. The underlying model loads lazily on the first embed() or embedBatch() call, and subsequent calls reuse the same pipeline instance.
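
For multiple inputs, embedBatch() embeds all strings in a single model forward pass and resolves to one vector per input. A minimal sketch, reusing the adapter from the quick start:

const vectors = await embedding.embedBatch(
  ['a red panda eating bamboo', 'a giant panda sleeping in a tree'],
  'document',
)

console.log(vectors.length)    // 2
console.log(vectors[0].length) // 384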

Choosing a model

The default model is Xenova/all-MiniLM-L6-v2, a 384-dimensional sentence embedding model that works well for general-purpose text similarity. Here are common alternatives:

| Model | Dimensions | Size (q8) | Use case |
| ----- | ---------- | --------- | -------- |
| Xenova/all-MiniLM-L6-v2 | 384 | ~23 MB | General-purpose, good balance of speed and quality |
| Xenova/bge-base-en-v1.5 | 768 | ~65 MB | Higher quality English embeddings, requires prefix |
| Xenova/bge-small-en-v1.5 | 384 | ~23 MB | Smaller BGE variant for English |
| Xenova/multilingual-e5-small | 384 | ~50 MB | Multilingual support, requires prefix |
| Xenova/gte-small | 384 | ~23 MB | Strong general-purpose alternative |

Set the dimensions config value to match the output dimensionality of your chosen model. If you set it incorrectly, the adapter will throw an error on the first embedding call.

const embedding = createTransformersEmbedding({
  model: 'Xenova/bge-base-en-v1.5',
  dimensions: 768,
})

Browser vs Node.js

All models work in both environments. In the browser, models are downloaded from the Hugging Face Hub and cached in the browser's Cache API. In Node.js, models are cached on disk at ~/.cache/huggingface/. The first call triggers the download; subsequent calls load from cache.
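
If you need a custom cache location in Node.js, that is configured on Transformers.js itself rather than on this adapter. A sketch using the @huggingface/transformers env object (a Transformers.js setting, not one of this package's config options):

import { env } from '@huggingface/transformers'

// Store downloaded models under ./models instead of the default cache directory.
// Set this before the first embed() call, i.e. before the model is loaded.
env.cacheDir = './models'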

Document and query prefixes

Some models (BGE, E5, and instruction-tuned models) require specific text prefixes for documents and queries. Configure these with documentPrefix and queryPrefix:

const embedding = createTransformersEmbedding({
  model: 'Xenova/bge-base-en-v1.5',
  dimensions: 768,
  documentPrefix: 'Represent this sentence: ',
  queryPrefix: 'Represent this sentence for searching relevant passages: ',
})

For E5 models:

const embedding = createTransformersEmbedding({
  model: 'Xenova/multilingual-e5-small',
  dimensions: 384,
  documentPrefix: 'passage: ',
  queryPrefix: 'query: ',
})

The adapter prepends the appropriate prefix based on the purpose argument ('document' or 'query') passed to embed() and embedBatch().
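
For example, with the E5 configuration above, the purpose argument decides which prefix is applied before the text reaches the model:

// Embedded as 'passage: red pandas mostly eat bamboo'
await embedding.embed('red pandas mostly eat bamboo', 'document')

// Embedded as 'query: what do red pandas eat'
await embedding.embed('what do red pandas eat', 'query')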

Device and quantization

Control where inference runs and at what precision:

const embedding = createTransformersEmbedding({
  dimensions: 384,
  device: 'webgpu',  // 'wasm' | 'webgpu' | 'cpu'
  dtype: 'q8',       // 'q8' | 'q4' | 'fp32' | 'fp16'
})

  • device: Defaults to auto-detection by Transformers.js. Use 'webgpu' for GPU acceleration in supported browsers. Use 'cpu' for Node.js environments.
  • dtype: Defaults to 'q8' (8-bit quantization). Lower precision like 'q4' reduces model size and speeds up inference at a small quality cost. Use 'fp32' for full-precision inference when accuracy matters more than speed, as in the sketch below.
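
For example, a Node.js setup that favors accuracy over speed could pin both options explicitly (a sketch; omitting either option falls back to the defaults described above):

const embedding = createTransformersEmbedding({
  dimensions: 384,
  device: 'cpu',  // run on the CPU backend in Node.js
  dtype: 'fp32',  // full-precision weights: larger download, no quantization loss
})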

Download progress

Track model download progress for a better loading experience:

const embedding = createTransformersEmbedding({
  dimensions: 384,
  progress: (data) => {
    console.log('Download progress:', data)
  },
})

The progress callback is passed through to the Transformers.js pipeline() function as progress_callback. The callback data includes status, file name, and download percentage when available.
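
For instance, you can filter for per-file progress events and log a percentage. The event shape comes from Transformers.js, not this adapter; the status, file, and progress fields below reflect the typical events and should be treated as an assumption:

const embedding = createTransformersEmbedding({
  dimensions: 384,
  progress: (data) => {
    // Assumed Transformers.js event shape: { status, file, progress, ... }
    if (data.status === 'progress') {
      console.log(`${data.file}: ${Math.round(data.progress)}%`)
    }
  },
})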

Integration with Narsil

The adapter plugs into Narsil's embedding configuration for automatic vector generation on insert and text-based vector search on query:

import { createNarsil } from '@delali/narsil'
import { createTransformersEmbedding } from '@delali/narsil-embeddings-transformers'

const embeddingAdapter = createTransformersEmbedding({
  dimensions: 384,
})

const narsil = createNarsil({
  embedding: embeddingAdapter,
})

const index = await narsil.createIndex({
  name: 'articles',
  schema: {
    title: 'string',
    body: 'string',
    titleVector: 'vector[384]',
  },
  embedding: {
    fields: {
      titleVector: ['title', 'body'],
    },
  },
})

await index.insert({
  title: 'Introduction to Vector Search',
  body: 'Vector search finds similar items by comparing numerical representations...',
})

const results = await index.query({
  vector: {
    field: 'titleVector',
    text: 'how does semantic search work',
    limit: 10,
  },
})

When you insert a document, Narsil automatically generates embeddings for the titleVector field by concatenating the title and body source fields and passing them through the adapter. When you query with text instead of a raw vector, Narsil embeds the query text using the same adapter with the 'query' purpose.

Shutdown and cleanup

Release the ONNX session and free memory by calling shutdown():

await embeddingAdapter.shutdown()

If the adapter is passed to a Narsil instance, calling narsil.shutdown() will shut down the embedding adapter automatically.
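
When using the adapter on its own (outside a Narsil instance), a try/finally block is a simple way to make sure the pipeline is always released:

const embedding = createTransformersEmbedding({ dimensions: 384 })

try {
  const vector = await embedding.embed('a red panda eating bamboo', 'document')
  // ... use the vector ...
} finally {
  await embedding.shutdown()
}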

API reference

createTransformersEmbedding(config)

Returns an object conforming to Narsil's EmbeddingAdapter interface.

Config options

| Option | Type | Default | Description |
| ------ | ---- | ------- | ----------- |
| dimensions | number | (required) | Output dimensionality of the model. Must match the model's actual output size. |
| model | string | 'Xenova/all-MiniLM-L6-v2' | Hugging Face model identifier for the feature-extraction pipeline. |
| dtype | string | 'q8' | Model quantization level: 'q8', 'q4', 'fp32', 'fp16'. |
| device | 'wasm' \| 'webgpu' \| 'cpu' | auto-detect | Inference backend. Omit to let Transformers.js pick the best available. |
| pooling | 'mean' \| 'cls' | 'mean' | Token pooling strategy for generating a single vector from token-level outputs. |
| normalize | boolean | true | Whether to L2-normalize output vectors. |
| documentPrefix | string | '' | Text prepended to input when purpose is 'document'. |
| queryPrefix | string | '' | Text prepended to input when purpose is 'query'. |
| progress | (data: unknown) => void | - | Callback for model download progress events. |
| pipelineOptions | Record<string, unknown> | - | Additional options passed through to the Transformers.js pipeline() constructor. |

Returned adapter methods

| Method | Signature | Description |
| ------ | --------- | ----------- |
| embed | (input: string, purpose: 'document' \| 'query', signal?: AbortSignal) => Promise<Float32Array> | Embed a single string. |
| embedBatch | (inputs: string[], purpose: 'document' \| 'query', signal?: AbortSignal) => Promise<Float32Array[]> | Embed multiple strings in a single model forward pass. |
| dimensions | readonly number | The configured output dimensionality. |
| shutdown | () => Promise<void> | Release the model pipeline and free resources. |
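
Taken together, the returned adapter roughly matches the following TypeScript shape. This is a sketch derived from the table above; the authoritative EmbeddingAdapter type is defined by @delali/narsil:

interface EmbeddingAdapter {
  readonly dimensions: number
  embed(input: string, purpose: 'document' | 'query', signal?: AbortSignal): Promise<Float32Array>
  embedBatch(inputs: string[], purpose: 'document' | 'query', signal?: AbortSignal): Promise<Float32Array[]>
  shutdown(): Promise<void>
}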

License

Apache-2.0