hf-embedder

v0.2.1

A local text embedding library for Node.js using HuggingFace ONNX models via Transformers.js

Local text embedding for Node.js. Runs a HuggingFace ONNX model via Transformers.js — no Python, no external services.

import { Embedder } from 'hf-embedder'

const embedder = await Embedder.create({ model: 'Xenova/multilingual-e5-small' })

const vector = await embedder.embed('hello world')
// => number[] (384 dimensions)

const batch = await embedder.embed(['cat', 'dog', 'fish'])
// => number[][] (3 vectors, 384 dimensions each)

Installation

npm install hf-embedder

Requires Node.js 20 or later. The package is ESM only.

Usage

import { Embedder } from 'hf-embedder'

// Pick any HuggingFace ONNX embedding model
const embedder = await Embedder.create({ model: 'Xenova/multilingual-e5-small' })

// Single string → number[]
const vec = await embedder.embed('your text here')

// Batch → number[][]
const vecs = await embedder.embed(['first', 'second', 'third'])

// Repeated input hits the in-memory cache (FIFO, default 100 entries)
const again = await embedder.embed('your text here')
// same values as vec, returned instantly without inference
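
To make the "FIFO, default 100 entries" behavior concrete, here is a minimal sketch of a first-in-first-out cache keyed by input text. It is an assumption about how such a cache could work, not the library's actual implementation; FifoCache is a hypothetical name.

```typescript
// Hypothetical FIFO cache, for illustration only (not part of hf-embedder).
class FifoCache<V> {
  private readonly entries = new Map<string, V>()
  private readonly capacity: number

  constructor(capacity: number) {
    this.capacity = capacity
  }

  get(key: string): V | undefined {
    // FIFO: a read does not refresh the entry's position (unlike LRU).
    return this.entries.get(key)
  }

  set(key: string, value: V): void {
    if (this.capacity <= 0) return // a capacity of 0 disables caching
    if (!this.entries.has(key) && this.entries.size >= this.capacity) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
    this.entries.set(key, value)
  }

  get size(): number {
    return this.entries.size
  }
}
```

With a capacity of 2, inserting 'a', 'b' and then 'c' evicts 'a', even if 'a' was read in between. That is the difference between FIFO and LRU eviction.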

Sync API

For environments where async inference is impractical (e.g., some bundlers, scripts), Embedder.createSync() returns a SyncEmbedder that runs the model on a background thread and blocks the calling thread via shared memory:

import { Embedder } from 'hf-embedder'

const embedder = Embedder.createSync({ model: 'Xenova/multilingual-e5-small' })

const vec = embedder.embedSync('hello world')
// => number[] (384 dimensions)

const batch = embedder.embedSync(['cat', 'dog', 'fish'])
// => number[][] (3 vectors, 384 dimensions each)

The sync variant shares the same model cache (~/.hfembedder/.cache/models/), result cache semantics, and API shape. Only the method name changes, from embed to embedSync.

SyncEmbedder can also be imported and constructed directly:

import { SyncEmbedder } from 'hf-embedder'

const embedder = new SyncEmbedder({ model: 'Xenova/multilingual-e5-small', cacheSize: 50 })
const vec = embedder.embedSync('hello')
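
The general technique of blocking the caller while a background thread does the work can be sketched with Node's worker_threads, a SharedArrayBuffer flag, and Atomics.wait. This is only an illustration of the pattern, not hf-embedder's actual code: fakeEmbedSync and the worker body are hypothetical, and the "inference" is a stand-in that returns character codes.

```typescript
import { Worker, MessageChannel, receiveMessageOnPort } from 'node:worker_threads'

// Worker body (plain CommonJS, run via eval). A real worker would run the
// model here; this stand-in just maps the text to character codes.
const workerSource = `
  const { parentPort } = require('node:worker_threads')
  parentPort.on('message', ({ text, signal, port }) => {
    port.postMessage(Array.from(text, (ch) => ch.charCodeAt(0)))
    const flag = new Int32Array(signal)
    Atomics.store(flag, 0, 1) // mark the result as ready
    Atomics.notify(flag, 0)   // wake the blocked caller
  })
`

const worker = new Worker(workerSource, { eval: true })
worker.unref() // do not keep the process alive just for this worker

function fakeEmbedSync(text: string): number[] {
  const signal = new SharedArrayBuffer(4)
  const flag = new Int32Array(signal)
  const { port1, port2 } = new MessageChannel()
  worker.postMessage({ text, signal, port: port2 }, [port2])

  // Block this thread until the worker flips the flag (or 10 s pass).
  if (Atomics.wait(flag, 0, 0, 10000) === 'timed-out') {
    port1.close()
    throw new Error('worker did not respond in time')
  }

  // The worker posted its result before signalling, so it can be read synchronously.
  const reply = receiveMessageOnPort(port1)
  port1.close()
  if (!reply) throw new Error('worker signalled without sending a result')
  return reply.message as number[]
}
```

The trade-off of this pattern is that the calling thread does nothing else while it waits, so it suits scripts and build steps better than servers.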

Options

interface EmbedderOptions {
  model?: string                   // HF model ID (default: 'onnx-community/Qwen3-Embedding-0.6B-ONNX')
  dtype?: string                   // quantization (default: 'q8')
  device?: string                  // execution device, e.g. 'cpu', 'cuda', 'wasm'
  pooling?: 'mean' | 'last_token'  // pooling strategy (default: 'mean')
  normalize?: boolean              // L2-normalize output (default: true)
  queue?: boolean                  // serialize inference calls (concurrency: 1)
  concurrency?: number             // set a specific concurrency limit
  cacheSize?: number               // in-memory result cache size (default: 100; 0 disables)
}

// Custom model
const custom = await Embedder.create({ model: 'other-org/my-embedding-model' })

// Use GPU
const gpu = await Embedder.create({ device: 'cuda' })

// Serial execution: safe for memory-constrained environments
const serial = await Embedder.create({ queue: true })

// Limited parallelism
const limited = await Embedder.create({ concurrency: 2 })

// No result caching
const uncached = await Embedder.create({ cacheSize: 0 })
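
To illustrate what a concurrency limit means for the queue and concurrency options, here is a minimal sketch of a promise-based limiter. It is an assumption about the general technique, not the library's implementation; createLimiter is a hypothetical helper.

```typescript
// Hypothetical concurrency limiter, for illustration only (not part of hf-embedder).
function createLimiter(limit: number) {
  let active = 0
  const waiting: Array<() => void> = []

  // Start the next queued task if a slot is free.
  const next = (): void => {
    if (active >= limit) return
    const start = waiting.shift()
    if (start) start()
  }

  return function run<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = (): void => {
        active++
        let result: Promise<T>
        try {
          result = task()
        } catch (err) {
          result = Promise.reject(err) // treat a synchronous throw like a rejection
        }
        // Settle the caller's promise, then free the slot and start the next task.
        result.then(resolve, reject).then(() => {
          active--
          next()
        })
      }
      if (active < limit) start()
      else waiting.push(start)
    })
  }
}
```

A limit of 1 gives the serialized behavior described for queue: true, and a higher limit corresponds to a setting such as concurrency: 2. Tasks beyond the limit wait in arrival order.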

Default Model

  • Model: onnx-community/Qwen3-Embedding-0.6B-ONNX
  • Quantization: q8
  • Pipeline: feature-extraction with mean pooling + L2 normalization
  • Output dimension: 1024
  • Cache: ~/.hfembedder/.cache/models/ (auto-downloaded on first use)

For a smaller alternative, use Xenova/multilingual-e5-small (384 dimensions, ~90 MB).
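
As a rough illustration of what "mean pooling + L2 normalization" in the default pipeline means, here is a minimal sketch in plain TypeScript. It is not the library's or Transformers.js's code: meanPool, l2Normalize and dot are hypothetical helpers, and real pipelines typically also weight tokens by the attention mask, which is omitted here.

```typescript
// Hypothetical helpers, for illustration only (not part of hf-embedder).

// Average the per-token vectors into a single sentence vector.
function meanPool(tokens: number[][]): number[] {
  if (tokens.length === 0) throw new Error('need at least one token vector')
  const dim = tokens[0].length
  const sum = new Array<number>(dim).fill(0)
  for (const token of tokens) {
    for (let i = 0; i < dim; i++) sum[i] += token[i]
  }
  return sum.map((value) => value / tokens.length)
}

// Scale a vector to unit length; a zero vector is returned unchanged.
function l2Normalize(vector: number[]): number[] {
  const norm = Math.sqrt(vector.reduce((acc, x) => acc + x * x, 0))
  return norm === 0 ? vector.slice() : vector.map((x) => x / norm)
}

// Dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  return a.reduce((acc, x, i) => acc + x * b[i], 0)
}
```

Because normalized vectors have unit length, the dot product of two embeddings equals their cosine similarity, so no extra division is needed when comparing outputs produced with normalize left at its default of true.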

License

MIT