@localmode/transformers

HuggingFace Transformers.js provider for LocalMode — run ML models locally in the browser.

Docs · Demo

Features

  • Browser-Native - Run ML models directly in the browser with WebGPU/WASM
  • Privacy-First - All processing happens locally, no data leaves the device
  • Model Caching - Models are cached in IndexedDB for instant subsequent loads
  • Optimized - Uses quantized models for smaller size and faster inference

Installation

pnpm add @localmode/transformers @localmode/core

Overview

@localmode/transformers provides model implementations for the interfaces defined in @localmode/core. It wraps HuggingFace Transformers.js to enable local ML inference in the browser.


Provider API

All models are created via the transformers provider object. Each factory method returns a model implementing a @localmode/core interface.

Embeddings — Docs

import { embed, embedMany } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const embeddingModel = transformers.embedding('Xenova/bge-small-en-v1.5');

const { embedding } = await embed({ model: embeddingModel, value: 'Hello world' });
const { embeddings } = await embedMany({ model: embeddingModel, values: ['Hello', 'World'] });

| Method | Interface | Description |
| ------ | --------- | ----------- |
| transformers.embedding(modelId) | EmbeddingModel | Text embeddings |

Recommended Models:

  • Xenova/all-MiniLM-L6-v2 - Fast, general-purpose (~22MB)
  • Xenova/paraphrase-multilingual-MiniLM-L12-v2 - 50+ languages

Multimodal Embeddings (CLIP/SigLIP) — Docs

Embed both text and images into the same vector space for cross-modal search.

import { embed, embedImage, cosineSimilarity } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const model = transformers.multimodalEmbedding('Xenova/clip-vit-base-patch32');

// Text embedding
const { embedding: textVec } = await embed({ model, value: 'a photo of a cat' });

// Image embedding (same vector space)
const { embedding: imgVec } = await embedImage({ model, image: catImageBlob });

// Cross-modal similarity
const similarity = cosineSimilarity(textVec, imgVec);

| Method | Interface | Description |
| ------ | --------- | ----------- |
| transformers.multimodalEmbedding(modelId) | MultimodalEmbeddingModel | Text + image embeddings |

Recommended Models:

  • Xenova/clip-vit-base-patch32 - Fast, 512 dimensions
  • Xenova/clip-vit-base-patch16 - Better accuracy, 512 dimensions

Reranking — Docs

import { rerank } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const rerankerModel = transformers.reranker('Xenova/ms-marco-MiniLM-L-6-v2');

const { results } = await rerank({
  model: rerankerModel,
  query: 'What is machine learning?',
  documents: ['ML is a subset of AI...', 'Python is a language...'],
  topK: 5,
});

| Method | Interface | Description |
| ------ | --------- | ----------- |
| transformers.reranker(modelId) | RerankerModel | Document reranking |

Classification & NLP — Docs

import { classify, extractEntities } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const sentiment = await classify({
  model: transformers.classifier('Xenova/distilbert-base-uncased-finetuned-sst-2-english'),
  text: 'I love this product!',
});

const entities = await extractEntities({
  model: transformers.ner('Xenova/bert-base-NER'),
  text: 'John works at Microsoft in Seattle',
});

| Method | Interface | Description |
| ------ | --------- | ----------- |
| transformers.classifier(modelId) | ClassificationModel | Text classification |
| transformers.zeroShot(modelId) | ZeroShotClassificationModel | Zero-shot text classification |
| transformers.ner(modelId) | NERModel | Named Entity Recognition |

Translation & Summarization

| Method | Interface | Description | Docs |
| ------ | --------- | ----------- | ---- |
| transformers.translator(modelId) | TranslationModel | Text translation | Docs |
| transformers.summarizer(modelId) | SummarizationModel | Text summarization | Docs |
| transformers.fillMask(modelId) | FillMaskModel | Masked token prediction | Docs |
| transformers.questionAnswering(modelId) | QuestionAnsweringModel | Extractive QA | Docs |
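
The core helper names for these tasks are not shown in this README; as a rough sketch, assuming @localmode/core exposes translate and summarize helpers in the style of classify and transcribe (check the linked docs for the actual API):

import { translate, summarize } from '@localmode/core'; // assumed helper names
import { transformers } from '@localmode/transformers';

// English-to-German translation (see Recommended Models below)
const { text: german } = await translate({
  model: transformers.translator('Xenova/opus-mt-en-de'),
  text: 'Machine learning runs in the browser.',
});

// Condense a longer passage
const { text: summary } = await summarize({
  model: transformers.summarizer('Xenova/distilbart-cnn-6-6'),
  text: longArticle, // placeholder: any long input string
});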

Audio

import { transcribe, synthesizeSpeech } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const transcription = await transcribe({
  model: transformers.speechToText('onnx-community/moonshine-tiny-ONNX'),
  audio: audioBlob,
  returnTimestamps: true,
});

const { audio, sampleRate } = await synthesizeSpeech({
  model: transformers.textToSpeech('onnx-community/Kokoro-82M-v1.0-ONNX'),
  text: 'Hello, how are you?',
});

| Method | Interface | Description | Docs |
| ------ | --------- | ----------- | ---- |
| transformers.speechToText(modelId) | SpeechToTextModel | Speech-to-text transcription | Docs |
| transformers.textToSpeech(modelId) | TextToSpeechModel | Text-to-speech synthesis | Docs |
| transformers.audioClassifier(modelId) | AudioClassificationModel | Audio classification | |
| transformers.zeroShotAudioClassifier(modelId) | ZeroShotAudioClassificationModel | Zero-shot audio classification | |

Vision

import { classifyImage, captionImage } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const classification = await classifyImage({
  model: transformers.imageClassifier('Xenova/vit-base-patch16-224'),
  image: imageBlob,
});

const caption = await captionImage({
  model: transformers.captioner('onnx-community/Florence-2-base-ft'),
  image: imageBlob,
});

| Method | Interface | Description | Docs |
| ------ | --------- | ----------- | ---- |
| transformers.imageClassifier(modelId) | ImageClassificationModel | Image classification | Docs |
| transformers.zeroShotImageClassifier(modelId) | ZeroShotImageClassificationModel | Zero-shot image classification | Docs |
| transformers.captioner(modelId) | ImageCaptionModel | Image captioning | Docs |
| transformers.segmenter(modelId) | SegmentationModel | Image segmentation | Docs |
| transformers.objectDetector(modelId) | ObjectDetectionModel | Object detection | Docs |
| transformers.imageFeatures(modelId) | ImageFeatureModel | Image feature extraction | Docs |
| transformers.imageToImage(modelId) | ImageToImageModel | Image super resolution | Docs |
| transformers.depthEstimator(modelId) | DepthEstimationModel | Monocular depth estimation | |

OCR & Document QA

| Method | Interface | Description | Docs |
| ------ | --------- | ----------- | ---- |
| transformers.ocr(modelId) | OCRModel | OCR (TrOCR) | Docs |
| transformers.documentQA(modelId) | DocumentQAModel | Document/Table question answering | Docs |
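
A minimal sketch, assuming hypothetical recognizeText and answerDocument helpers in @localmode/core (the actual function names for these tasks are not shown in this README; see the linked docs):

import { recognizeText, answerDocument } from '@localmode/core'; // hypothetical names
import { transformers } from '@localmode/transformers';

// Read printed text from a scanned page
const { text } = await recognizeText({
  model: transformers.ocr('Xenova/trocr-small-printed'),
  image: scanBlob, // placeholder: Blob or File of the page image
});

// Ask a question about a document image
const { answer } = await answerDocument({
  model: transformers.documentQA('Xenova/donut-base-finetuned-docvqa'),
  image: invoiceBlob, // placeholder: Blob or File
  question: 'What is the total amount?',
});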

Text Generation (Experimental) — Docs

Experimental: Uses Transformers.js v4 (preview release). The API may change.

Run ONNX-format language models in the browser with WebGPU acceleration:

import { generateText, streamText } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const model = transformers.languageModel('onnx-community/Qwen3.5-0.8B-ONNX');

// Single-shot generation
const { text } = await generateText({ model, prompt: 'What is 2+2?' });

// Streaming generation
const result = await streamText({ model, prompt: 'Write a haiku' });
for await (const chunk of result.stream) {
  process.stdout.write(chunk.text);
}

| Method | Interface | Description |
| ------ | --------- | ----------- |
| transformers.languageModel(modelId) | LanguageModel | Text generation (ONNX, WebGPU/WASM) |

Recommended ONNX LLMs:

| Model | Size | Context | Vision |
| ----- | ---- | ------- | ------ |
| onnx-community/Qwen3.5-0.8B-ONNX | ~500MB | 32K | Yes |
| onnx-community/Qwen3.5-2B-ONNX | ~1.5GB | 32K | Yes |
| onnx-community/Qwen3.5-4B-ONNX | ~2.5GB | 32K | Yes |
| onnx-community/SmolLM2-360M-Instruct | ~200MB | 2K | No |
| onnx-community/SmolLM2-135M-Instruct | ~80MB | 2K | No |

Vision support: Qwen3.5 models support image input via their built-in vision encoder. Check model.supportsVision for feature detection. See Vision docs for usage.
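
A hedged sketch of feature detection; the message shape for image input below is an assumption modeled on chat-style APIs, not a confirmed interface (see the Vision docs for the real usage):

import { generateText } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const model = transformers.languageModel('onnx-community/Qwen3.5-0.8B-ONNX');

if (model.supportsVision) {
  // Assumed chat-style messages with an image content part
  const { text } = await generateText({
    model,
    messages: [{
      role: 'user',
      content: [
        { type: 'image', image: imageBlob }, // placeholder: Blob or File
        { type: 'text', text: 'Describe this image.' },
      ],
    }],
  });
}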


Model Utilities

import { preloadModel, isModelCached, getModelStorageUsage } from '@localmode/transformers';

// Check whether a model is already cached in IndexedDB
const cached = await isModelCached('Xenova/bge-small-en-v1.5');

// Download and cache a model ahead of time, reporting progress
await preloadModel('Xenova/bge-small-en-v1.5', {
  onProgress: (p) => console.log(`${p.progress}% loaded`),
});

// Inspect how much storage cached models occupy
const usage = await getModelStorageUsage();

Recommended Models

Embeddings

| Model | Description |
| ----- | ----------- |
| Xenova/bge-small-en-v1.5 | Fast, general-purpose (~22MB, 384d) |
| Xenova/paraphrase-multilingual-MiniLM-L12-v2 | 50+ languages (~120MB, 384d) |
| Xenova/all-mpnet-base-v2 | Higher quality (~420MB, 768d) |
| Snowflake/snowflake-arctic-embed-xs | Tiny retrieval embeddings (~23MB, 384d) |

Reranking

| Model | Description |
| ----- | ----------- |
| Xenova/ms-marco-MiniLM-L-6-v2 | Fast, small (~23MB, recommended) |

Text Classification

| Model | Description |
| ----- | ----------- |
| Xenova/distilbert-base-uncased-finetuned-sst-2-english | Sentiment analysis |
| Xenova/twitter-roberta-base-sentiment-latest | Twitter sentiment |

Zero-Shot Classification

| Model | Description |
| ----- | ----------- |
| Xenova/mobilebert-uncased-mnli | Fast, mobile-friendly (~21MB) |
| Xenova/nli-deberta-v3-xsmall | Mid-tier accuracy (~90MB) |

Named Entity Recognition

| Model | Description |
| ----- | ----------- |
| Xenova/bert-base-NER | Standard NER (PER, ORG, LOC, MISC) |

Translation

| Model | Description |
| ----- | ----------- |
| Xenova/opus-mt-en-de | English to German |
| Xenova/opus-mt-en-fr | English to French |
| Xenova/opus-mt-en-es | English to Spanish |

Summarization

| Model | Description |
| ----- | ----------- |
| Xenova/distilbart-cnn-6-6 | Best quality browser summarizer (~284MB) |

Fill-Mask

| Model | Description |
| ----- | ----------- |
| onnx-community/ModernBERT-base-ONNX | General purpose (mask: [MASK]) |

Question Answering

| Model | Description |
| ----- | ----------- |
| Xenova/distilbert-base-cased-distilled-squad | SQuAD trained (~65MB) |

Speech-to-Text

| Model | Description |
| ----- | ----------- |
| onnx-community/moonshine-tiny-ONNX | Fast, edge-optimized (~50MB) |
| onnx-community/moonshine-base-ONNX | Best quality/size ratio (~237MB) |

Text-to-Speech

| Model | Description |
| ----- | ----------- |
| onnx-community/Kokoro-82M-v1.0-ONNX | Natural speech, 28 voices (~86MB) |

Image Classification

| Model | Description |
| ----- | ----------- |
| Xenova/vit-base-patch16-224 | General image classification |
| Xenova/siglip-base-patch16-224 | Zero-shot image classification (~400MB) |

Image Captioning

| Model | Description |
| ----- | ----------- |
| onnx-community/Florence-2-base-ft | High-quality captions (~223MB) |

Image Segmentation

| Model | Description |
| ----- | ----------- |
| Xenova/segformer-b0-finetuned-ade-512-512 | Semantic segmentation (ADE20K) |

Object Detection

| Model | Description |
| ----- | ----------- |
| onnx-community/dfine_n_coco-ONNX | State-of-the-art, tiny (~4.5MB) |
| Xenova/yolos-tiny | Fast detection |

Image Features

| Model | Description |
| ----- | ----------- |
| Xenova/siglip-base-patch16-224 | Image embeddings (768d) |
| onnx-community/dinov2-base-ONNX | Self-supervised features |

Image Super Resolution

| Model | Description |
| ----- | ----------- |
| Xenova/swin2SR-lightweight-x2-64 | 2x upscale, fast |
| Xenova/swin2SR-classical-sr-x4-64 | 4x upscale |

OCR

| Model | Description |
| ----- | ----------- |
| Xenova/trocr-small-printed | Printed text (~120MB) |
| Xenova/trocr-small-handwritten | Handwritten text (~120MB) |

Document QA

| Model | Description |
| ----- | ----------- |
| onnx-community/Florence-2-base-ft | Document QA (~223MB) |
| Xenova/donut-base-finetuned-docvqa | Donut (~218MB) |


Model Constants

All recommended models are exported as constants for easy reference:

import {
  transformers,                // Provider object (used below)
  MODELS,                      // All models organized by task
  EMBEDDING_MODELS,
  CLASSIFICATION_MODELS,
  ZERO_SHOT_MODELS,
  NER_MODELS,
  RERANKER_MODELS,
  SPEECH_TO_TEXT_MODELS,
  TEXT_TO_SPEECH_MODELS,
  IMAGE_CLASSIFICATION_MODELS,
  ZERO_SHOT_IMAGE_MODELS,
  IMAGE_CAPTION_MODELS,
  TRANSLATION_MODELS,
  SUMMARIZATION_MODELS,
  FILL_MASK_MODELS,
  QUESTION_ANSWERING_MODELS,
  OBJECT_DETECTION_MODELS,
  SEGMENTATION_MODELS,
  OCR_MODELS,
  DOCUMENT_QA_MODELS,
  IMAGE_TO_IMAGE_MODELS,
  IMAGE_FEATURE_MODELS,
} from '@localmode/transformers';

// Use with provider
const model = transformers.embedding(EMBEDDING_MODELS.BGE_SMALL_EN);

Advanced Usage

Custom Model Options

const model = transformers.embedding('Xenova/bge-small-en-v1.5', {
  quantized: true, // Use quantized model (smaller, faster)
  device: 'webgpu', // Use WebGPU for acceleration (falls back to WASM)
});

Provider Options

Pass provider-specific options to core functions:

const { embedding } = await embed({
  model: transformers.embedding('Xenova/bge-small-en-v1.5'),
  value: 'Hello world',
  providerOptions: {
    transformers: {
      // Any Transformers.js specific options
    },
  },
});

Preloading Models

For better UX, preload models before use:

import { preloadModel, isModelCached, transformers } from '@localmode/transformers';
import { embed } from '@localmode/core';

if (!(await isModelCached('Xenova/bge-small-en-v1.5'))) {
  await preloadModel('Xenova/bge-small-en-v1.5', {
    onProgress: (p) => console.log(`Loading: ${p.progress}%`),
  });
}

// Subsequent calls are instant (loaded from cache)
const embeddingModel = transformers.embedding('Xenova/bge-small-en-v1.5');
const { embedding } = await embed({ model: embeddingModel, value: 'Hello' });

Exported Implementation Classes

For advanced use cases, implementation classes are available:

import {
  TransformersEmbeddingModel,
  TransformersClassificationModel,
  TransformersZeroShotModel,
  TransformersNERModel,
  TransformersRerankerModel,
  TransformersSpeechToTextModel,
  TransformersImageClassificationModel,
  TransformersZeroShotImageModel,
  TransformersCaptionModel,
} from '@localmode/transformers';

Additional implementation classes can be imported from the implementations subpath:

import {
  TransformersSegmentationModel,
  TransformersObjectDetectionModel,
  TransformersImageFeatureModel,
  TransformersImageToImageModel,
  TransformersTextToSpeechModel,
  TransformersAudioClassificationModel,
  TransformersZeroShotAudioClassificationModel,
  TransformersTranslationModel,
  TransformersSummarizationModel,
  TransformersFillMaskModel,
  TransformersQuestionAnsweringModel,
  TransformersOCRModel,
  TransformersDocumentQAModel,
  TransformersDepthEstimationModel,
} from '@localmode/transformers/implementations';

Browser Compatibility

| Browser | WebGPU | WASM | Notes |
| ----------- | ------ | ---- | ---------------------------- |
| Chrome 113+ | ✅ | ✅ | Best performance with WebGPU |
| Edge 113+ | ✅ | ✅ | Same as Chrome |
| Firefox | ❌ | ✅ | WASM only |
| Safari 18+ | ✅ | ✅ | WebGPU available |
| iOS Safari | ✅ | ✅ | WebGPU available (iOS 26+) |
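
To pick a backend at runtime, you can feature-detect WebGPU via the standard navigator.gpu API (this sketch assumes the device option accepts 'wasm' as well as 'webgpu', which the table above implies but the docs here do not show explicitly):

import { transformers } from '@localmode/transformers';

// WebGPU is exposed as navigator.gpu in supporting browsers
const device = 'gpu' in navigator ? 'webgpu' : 'wasm';
const model = transformers.embedding('Xenova/bge-small-en-v1.5', { device });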

Performance Tips

  1. Use quantized models - Smaller and faster with minimal quality loss
  2. Preload models - Load during app init for instant inference
  3. Use WebGPU when available - 3-5x faster than WASM
  4. Batch operations - Process multiple inputs together (see the sketch below)
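
For tip 4, embedMany (shown in the Embeddings section) is the natural way to batch: one call over all inputs instead of a per-input loop.

import { embed, embedMany } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const model = transformers.embedding('Xenova/bge-small-en-v1.5');
const texts = ['first document', 'second document', 'third document'];

// One inference call per input
for (const text of texts) {
  await embed({ model, value: text });
}

// A single batched call (tip 4)
const { embeddings } = await embedMany({ model, values: texts });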

Acknowledgments

This package is built on Transformers.js by HuggingFace — state-of-the-art ML models running in the browser via ONNX Runtime.

License

MIT