@localmode/transformers
HuggingFace Transformers.js provider for LocalMode AI Engine - run ML models locally in the browser.
Features
- 🚀 Browser-Native - Run ML models directly in the browser with WebGPU/WASM
- 🔒 Privacy-First - All processing happens locally, no data leaves the device
- 📦 Model Caching - Models are cached in IndexedDB for instant subsequent loads
- ⚡ Optimized - Uses quantized models for smaller size and faster inference
Installation
```sh
pnpm install @localmode/transformers @localmode/core
```

Overview
@localmode/transformers provides model implementations for the interfaces defined in @localmode/core. It wraps HuggingFace Transformers.js to enable local ML inference in the browser.
✅ Live Features
These features are production-ready and actively used in applications.
Embeddings
Generate text embeddings for semantic search, clustering, and similarity.
```ts
import { embed, embedMany } from '@localmode/core';
import { transformers } from '@localmode/transformers';
// Create embedding model
const embeddingModel = transformers.embedding('Xenova/all-MiniLM-L6-v2');
// Single embedding
const { embedding } = await embed({
model: embeddingModel,
value: 'Hello world',
});
// Batch embeddings
const { embeddings } = await embedMany({
model: embeddingModel,
values: ['Hello', 'World', 'How are you?'],
});
```

| Method                            | Interface        | Description     |
| --------------------------------- | ---------------- | --------------- |
| transformers.embedding(modelId) | EmbeddingModel | Text embeddings |
Recommended Models:
- `Xenova/all-MiniLM-L6-v2` - Fast, general-purpose (~22MB)
- `Xenova/paraphrase-multilingual-MiniLM-L12-v2` - 50+ languages
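An embedding is just an array of numbers, so semantic search needs only a similarity function on top of `embed`/`embedMany`. A minimal sketch, assuming embeddings come back as plain `number[]` vectors; the `cosineSimilarity` helper is illustrative, not part of the package:

```ts
import { embed, embedMany } from '@localmode/core';
import { transformers } from '@localmode/transformers';

// Cosine similarity: dot(a, b) / (|a| * |b|). Higher means more similar.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const model = transformers.embedding('Xenova/all-MiniLM-L6-v2');
const docs = ['ML is a subset of AI', 'Paris is the capital of France'];

// Embed the query and the corpus, then rank documents by similarity.
const { embedding: queryVec } = await embed({ model, value: 'What is machine learning?' });
const { embeddings } = await embedMany({ model, values: docs });
const ranked = docs
  .map((doc, i) => ({ doc, score: cosineSimilarity(queryVec, embeddings[i]) }))
  .sort((a, b) => b.score - a.score);
```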
Reranking
Improve RAG accuracy by reranking search results.
```ts
import { rerank } from '@localmode/core';
import { transformers } from '@localmode/transformers';
const rerankerModel = transformers.reranker('Xenova/ms-marco-MiniLM-L-6-v2');
const { results } = await rerank({
model: rerankerModel,
query: 'What is machine learning?',
documents: ['ML is a subset of AI...', 'Python is a language...', 'Neural networks...'],
topK: 5,
});
console.log(results); // Sorted by relevance score
```

| Method                           | Interface       | Description        |
| -------------------------------- | --------------- | ------------------ |
| transformers.reranker(modelId) | RerankerModel | Document reranking |
Recommended Models:
- `Xenova/ms-marco-MiniLM-L-6-v2` - Document reranking for RAG
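Embeddings and reranking compose naturally: cheap cosine similarity pulls a candidate set, then the cross-encoder reorders it. A sketch under the same assumptions as the embedding example above (the in-memory corpus and candidate-set size are illustrative):

```ts
import { embed, embedMany, rerank } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const embedder = transformers.embedding('Xenova/all-MiniLM-L6-v2');
const reranker = transformers.reranker('Xenova/ms-marco-MiniLM-L-6-v2');

// Compact cosine similarity (embeddings assumed to be number[] vectors).
const cos = (a: number[], b: number[]) =>
  a.reduce((s, x, i) => s + x * b[i], 0) / (Math.hypot(...a) * Math.hypot(...b));

const corpus = ['ML is a subset of AI...', 'Python is a language...', 'Neural networks...'];
const { embeddings } = await embedMany({ model: embedder, values: corpus });

// Stage 1: coarse retrieval by embedding similarity.
const query = 'What is machine learning?';
const { embedding: queryVec } = await embed({ model: embedder, value: query });
const candidates = corpus
  .map((doc, i) => ({ doc, score: cos(queryVec, embeddings[i]) }))
  .sort((a, b) => b.score - a.score)
  .slice(0, 20)
  .map((c) => c.doc);

// Stage 2: precise reranking of the candidate set.
const { results } = await rerank({ model: reranker, query, documents: candidates, topK: 5 });
```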
Model Utilities
Manage model loading and caching.
```ts
import { preloadModel, isModelCached, getModelStorageUsage } from '@localmode/transformers';
// Check if model is cached
const cached = await isModelCached('Xenova/all-MiniLM-L6-v2');
// Preload model with progress
await preloadModel('Xenova/all-MiniLM-L6-v2', {
onProgress: (p) => console.log(`${p.progress}% loaded`),
});
// Check storage usage
const usage = await getModelStorageUsage();
```

🚧 Coming Soon
These features have interfaces defined and implementations available, but are under active development and testing.
Classification & NLP
| Method | Interface | Description |
| ------------------------------------------ | ----------------------------- | ----------------------------- |
| transformers.classifier(modelId) | ClassificationModel | Text classification |
| transformers.zeroShotClassifier(modelId) | ZeroShotClassificationModel | Zero-shot text classification |
| transformers.ner(modelId) | NERModel | Named Entity Recognition |
```ts
import { classify, extractEntities } from '@localmode/core';
import { transformers } from '@localmode/transformers';
// Text Classification
const sentiment = await classify({
model: transformers.classifier('Xenova/distilbert-base-uncased-finetuned-sst-2-english'),
text: 'I love this product!',
});
console.log(sentiment.label); // 'POSITIVE'
// Named Entity Recognition
const entities = await extractEntities({
model: transformers.ner('Xenova/bert-base-NER'),
text: 'John works at Microsoft in Seattle',
});
console.log(entities.entities);
// [{ entity: 'John', type: 'PER', ... }, { entity: 'Microsoft', type: 'ORG', ... }, ...]
```

Translation & Summarization
| Method | Interface | Description |
| ----------------------------------------- | ------------------------ | ----------------------- |
| transformers.translator(modelId) | TranslationModel | Text translation |
| transformers.summarizer(modelId) | SummarizationModel | Text summarization |
| transformers.fillMask(modelId) | FillMaskModel | Masked token prediction |
| transformers.questionAnswering(modelId) | QuestionAnsweringModel | Extractive QA |
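No usage example ships for these yet; here is a sketch of what calls would presumably look like, following the call pattern of the live features. The `translate` and `summarize` helpers and their result shapes are assumptions, not confirmed exports of `@localmode/core`:

```ts
// NOTE: `translate` and `summarize` are assumed exports mirroring
// `classify`/`transcribe`; check the package once these features ship.
import { translate, summarize } from '@localmode/core';
import { transformers } from '@localmode/transformers';

// Translation (English to German)
const { translation } = await translate({
  model: transformers.translator('Xenova/opus-mt-en-de'),
  text: 'Machine learning runs in the browser.',
});

// Summarization
const { summary } = await summarize({
  model: transformers.summarizer('Xenova/bart-large-cnn'),
  text: 'Some long article text to condense into a few sentences.',
});
```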
Audio
| Method | Interface | Description |
| ------------------------------------ | ------------------- | ---------------------------- |
| transformers.speechToText(modelId) | SpeechToTextModel | Speech-to-text transcription |
| transformers.textToSpeech(modelId) | TextToSpeechModel | Text-to-speech synthesis |
```ts
import { transcribe, synthesizeSpeech } from '@localmode/core';
import { transformers } from '@localmode/transformers';
// Speech-to-Text
const transcription = await transcribe({
model: transformers.speechToText('Xenova/whisper-tiny'),
audio: audioBlob,
returnTimestamps: true,
});
console.log(transcription.text);
// Text-to-Speech
const { audio, sampleRate } = await synthesizeSpeech({
model: transformers.textToSpeech('Xenova/speecht5_tts'),
text: 'Hello, how are you?',
});
```
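To actually play the synthesized speech, route the raw samples through the Web Audio API. A minimal sketch, assuming `audio` is a `Float32Array` of mono PCM samples at `sampleRate` (the exact result shape is not documented above):

```ts
// Play raw mono PCM samples via the Web Audio API.
function playPcm(audio: Float32Array, sampleRate: number): void {
  const ctx = new AudioContext();
  const buffer = ctx.createBuffer(1, audio.length, sampleRate); // 1 channel
  buffer.copyToChannel(audio, 0);
  const source = ctx.createBufferSource();
  source.buffer = buffer;
  source.connect(ctx.destination);
  source.start();
}

playPcm(audio, sampleRate); // values from synthesizeSpeech above
```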
Vision
| Method                                          | Interface                          | Description                              |
| ----------------------------------------------- | ---------------------------------- | --------------------------------------- |
| transformers.imageClassifier(modelId) | ImageClassificationModel | Image classification |
| transformers.zeroShotImageClassifier(modelId) | ZeroShotImageClassificationModel | Zero-shot image classification |
| transformers.captioner(modelId) | ImageCaptionModel | Image captioning |
| transformers.segmenter(modelId) | SegmentationModel | Image segmentation |
| transformers.objectDetector(modelId) | ObjectDetectionModel | Object detection |
| transformers.imageFeatures(modelId) | ImageFeatureModel | Image feature extraction |
| transformers.imageToImage(modelId) | ImageToImageModel | Image transformation / super resolution |
| transformers.ocr(modelId) | OCRModel | OCR (TrOCR) |
| transformers.documentQA(modelId) | DocumentQAModel | Document/Table question answering |
```ts
import { classifyImage, captionImage } from '@localmode/core';
import { transformers } from '@localmode/transformers';
// Image Classification
const classification = await classifyImage({
model: transformers.imageClassifier('Xenova/vit-base-patch16-224'),
image: imageBlob,
});
console.log(classification.predictions);
// Image Captioning
const caption = await captionImage({
model: transformers.captioner('Xenova/blip-image-captioning-base'),
image: imageBlob,
});
console.log(caption.caption);
```

All Recommended Models
Live Features
Embeddings
- `Xenova/all-MiniLM-L6-v2` - Fast, general-purpose (~22MB)
- `Xenova/paraphrase-multilingual-MiniLM-L12-v2` - 50+ languages
Reranking
- `Xenova/ms-marco-MiniLM-L-6-v2` - Document reranking for RAG
Coming Soon Features
Text Classification
- `Xenova/distilbert-base-uncased-finetuned-sst-2-english` - Sentiment analysis
- `Xenova/twitter-roberta-base-sentiment-latest` - Twitter sentiment
Named Entity Recognition
- `Xenova/bert-base-NER` - Standard NER (PER, ORG, LOC, MISC)
Translation
- `Xenova/opus-mt-en-de` - English to German
- `Xenova/opus-mt-en-fr` - English to French
- `Xenova/nllb-200-distilled-600M` - 200 languages
Summarization
- `Xenova/bart-large-cnn` - News summarization
- `Xenova/distilbart-cnn-12-6` - Fast summarization
Fill-Mask
- `Xenova/bert-base-uncased` - General purpose
- `Xenova/roberta-base` - Better for some tasks
Question Answering
- `Xenova/distilbert-base-cased-distilled-squad` - SQuAD trained
- `Xenova/roberta-base-squad2` - SQuAD 2.0 trained
Speech-to-Text
- `Xenova/whisper-tiny` - Fast, smaller size (~70MB)
- `Xenova/whisper-small` - Better accuracy (~240MB)
Text-to-Speech
- `Xenova/speecht5_tts` - Natural speech synthesis
Image Classification
- `Xenova/vit-base-patch16-224` - General image classification
- `Xenova/clip-vit-base-patch32` - Zero-shot image classification
Image Captioning
- `Xenova/blip-image-captioning-base` - High-quality captions
Image Segmentation
- `Xenova/segformer-b0-finetuned-ade-512-512` - Fast segmentation
- `Xenova/detr-resnet-50-panoptic` - Panoptic segmentation
Object Detection
- `Xenova/detr-resnet-50` - COCO objects
- `Xenova/yolos-tiny` - Fast detection
Image Features
- `Xenova/clip-vit-base-patch32` - Image embeddings
- `Xenova/dinov2-small` - Self-supervised features
OCR
- `Xenova/trocr-base-handwritten` - Handwritten text
- `Xenova/trocr-base-printed` - Printed text
Document QA
- `Xenova/donut-base-finetuned-docvqa` - Document QA
- `Xenova/tapas-base-finetuned-wtq` - Table QA
Advanced Usage
Custom Model Options
```ts
const model = transformers.embedding('Xenova/all-MiniLM-L6-v2', {
quantized: true, // Use quantized model (smaller, faster)
device: 'webgpu', // Use WebGPU for acceleration (falls back to WASM)
});
```

Provider Options
Pass provider-specific options to core functions:
```ts
const { embedding } = await embed({
model: transformers.embedding('Xenova/all-MiniLM-L6-v2'),
value: 'Hello world',
providerOptions: {
transformers: {
// Any Transformers.js specific options
},
},
});
```

Preloading Models
For better UX, preload models before use:
```ts
import { preloadModel, isModelCached } from '@localmode/transformers';
import { embed } from '@localmode/core';
// Check and preload if needed
if (!(await isModelCached('Xenova/all-MiniLM-L6-v2'))) {
await preloadModel('Xenova/all-MiniLM-L6-v2', {
onProgress: (p) => console.log(`Loading: ${p.progress}%`),
});
}
// Subsequent calls are instant (loaded from cache)
const embeddingModel = transformers.embedding('Xenova/all-MiniLM-L6-v2');
const { embedding } = await embed({ model: embeddingModel, value: 'Hello' });
```

Browser Compatibility
| Browser     | WebGPU | WASM | Notes                        |
| ----------- | ------ | ---- | ---------------------------- |
| Chrome 113+ | ✅     | ✅   | Best performance with WebGPU |
| Edge 113+   | ✅     | ✅   | Same as Chrome               |
| Firefox     | ❌     | ✅   | WASM only                    |
| Safari 18+  | ✅     | ✅   | WebGPU available             |
| iOS Safari  | ✅     | ✅   | WebGPU available (iOS 26+)   |
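Given the uneven WebGPU support above, it can be worth picking the device at runtime. A sketch using standard browser feature detection; passing `'wasm'` explicitly is an assumption, since Custom Model Options above only documents `'webgpu'` with automatic fallback:

```ts
import { transformers } from '@localmode/transformers';

// navigator.gpu is the standard WebGPU entry point; absent means WASM only.
const device = 'gpu' in navigator ? 'webgpu' : 'wasm';

const model = transformers.embedding('Xenova/all-MiniLM-L6-v2', { device });
```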
Performance Tips
- Use quantized models - Smaller and faster with minimal quality loss
- Preload models - Load during app init for instant inference
- Use WebGPU when available - 3-5x faster than WASM
- Batch operations - Process multiple inputs together, as sketched below
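For instance, a single `embedMany` call over all inputs gives the provider a chance to batch the forward passes, instead of paying model-call overhead once per input:

```ts
import { embed, embedMany } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const model = transformers.embedding('Xenova/all-MiniLM-L6-v2');
const values = ['first document', 'second document', 'third document'];

// Slower: one inference call per input.
for (const value of values) {
  await embed({ model, value });
}

// Faster: one batched call over all inputs.
const { embeddings } = await embedMany({ model, values });
```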
