Transformers.js provider for Vercel AI SDK


[!NOTE] This library is in active development. Expect frequent updates.

Transformers.js provider for the Vercel AI SDK. Run popular open-source 🤗 Transformers models directly in the browser OR even server-side.

Installation

npm i @built-in-ai/transformers-js

The @built-in-ai/transformers-js package is the AI SDK provider for Transformers models powered by the official @huggingface/transformers library. It supports both client-side (browser) and server-side (Node.js) inference.

Requirements

Browser (Client-side)

  • A modern browser with WebAssembly/WebGPU support
  • WebGPU is strongly recommended for optimal performance

Server (Node.js)

  • Node.js 18+ recommended
  • Automatic CPU inference
  • GPU acceleration available with appropriate hardware setup

Usage

Basic Usage (chat)

import { streamText } from "ai";
import { transformersJS } from "@built-in-ai/transformers-js";

const result = streamText({
  // or generateText
  model: transformersJS("HuggingFaceTB/SmolLM2-360M-Instruct"),
  messages: [{ role: "user", content: "Hello, how are you?" }],
});

for await (const chunk of result.textStream) {
  console.log(chunk);
}

Tool calling

Be aware that some models might struggle with this. For the best chance of success, I suggest using a reasoning model (e.g. Qwen3).

import { streamText, tool, stepCountIs } from "ai";
import { z } from "zod";
import { transformersJS } from "@built-in-ai/transformers-js";

const result = streamText({
  model: transformersJS("onnx-community/Qwen3-0.6B-ONNX"),
  tools: {
    weather: tool({
      description: "Get the weather in a location",
      inputSchema: z.object({
        location: z.string().describe("The location to get the weather for"),
      }),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
  stopWhen: stepCountIs(5),
  prompt: "What is the weather in San Francisco?",
});

Then, in your useChat hook, set sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithToolCalls so tool results are sent back to the model automatically, as sketched below.
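A minimal sketch of that client-side wiring (assumes the AI SDK v5 useChat hook from @ai-sdk/react; lastAssistantMessageIsCompleteWithToolCalls is exported from "ai"):

import { useChat } from "@ai-sdk/react";
import { lastAssistantMessageIsCompleteWithToolCalls } from "ai";

// Inside a React component: resend automatically once every tool call
// in the last assistant message has a result, so the model can finish.
const { messages, sendMessage } = useChat({
  sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithToolCalls,
});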

Server-side Inference

Although Transformers.js was originally designed to be used in the browser, it's also able to run inference on the server. This enables hybrid applications where you can seamlessly switch between client-side and server-side inference.

Basic Server-side Usage

// In a Next.js API route (app/api/chat/route.ts)
import { streamText } from "ai";
import { transformersJS } from "@built-in-ai/transformers-js";

export async function POST(req: Request) {
  const { messages } = await req.json();

  const model = transformersJS("HuggingFaceTB/SmolLM2-135M-Instruct"); // Be mindful not to pick too large a model

  const result = streamText({
    model,
    messages,
    temperature: 0.7,
  });

  return result.toUIMessageStreamResponse();
}
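On the client, that route can be consumed with the AI SDK's useChat hook. A minimal sketch (assumes @ai-sdk/react, whose default transport posts to /api/chat):

"use client";
import { useChat } from "@ai-sdk/react";

// Inside a React component: messages stream in as UI message parts,
// and sendMessage posts to the /api/chat route defined above.
const { messages, sendMessage, status } = useChat();

sendMessage({ text: "Hello from the browser!" });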

Hybrid Client/Server Setup

You can create applications that automatically choose between client-side and server-side inference:

import {
  transformersJS,
  doesBrowserSupportTransformersJS,
} from "@built-in-ai/transformers-js";

const useClientSide = doesBrowserSupportTransformersJS();

if (useClientSide) {
  // Browser: Use WebGPU for fast client-side inference
  const model = transformersJS("HuggingFaceTB/SmolLM2-360M-Instruct", {
    device: "webgpu",
    worker: new Worker(new URL("./worker.ts", import.meta.url), {
      type: "module",
    }),
  });
} else {
  // Fallback: Use server-side API route
  // (handled by your application routing)
}

Look at the Complete Hybrid Example section for more information.
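For illustration, the fallback branch could route prompts to the endpoint from the Basic Server-side Usage example (a sketch, not the library's API; the /api/chat path is an assumption taken from that example):

import { streamText } from "ai";
import {
  transformersJS,
  doesBrowserSupportTransformersJS,
} from "@built-in-ai/transformers-js";

async function answer(prompt: string) {
  if (doesBrowserSupportTransformersJS()) {
    // Browser: run locally with WebGPU
    const result = streamText({
      model: transformersJS("HuggingFaceTB/SmolLM2-360M-Instruct", {
        device: "webgpu",
      }),
      prompt,
    });
    for await (const chunk of result.textStream) console.log(chunk);
  } else {
    // Fallback: call the server-side API route
    const res = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
    });
    console.log(await res.text());
  }
}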

Vision Models

import { streamText } from "ai";
import { transformersJS } from "@built-in-ai/transformers-js";

const model = transformersJS("HuggingFaceTB/SmolVLM-256M-Instruct", {
  isVisionModel: true,
  device: "webgpu",
});

const result = streamText({
  model,
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Describe this image" },
        { type: "image", image: someImageBlobOrUrl },
      ],
    },
  ],
});

Advanced Usage (Web Worker)

Heavy model execution is best run off the main thread in a Web Worker. This package ships a ready-to-use handler that hides the messaging boilerplate so you can focus on building your app.

  1. Create worker.ts:
import { TransformersJSWorkerHandler } from "@built-in-ai/transformers-js";

const handler = new TransformersJSWorkerHandler();
self.onmessage = (msg: MessageEvent) => {
  handler.onmessage(msg);
};
  2. Provide the worker when creating the model:
import { streamText } from "ai";
import { transformersJS } from "@built-in-ai/transformers-js";

const model = transformersJS("HuggingFaceTB/SmolLM2-360M-Instruct", {
  worker: new Worker(new URL("./worker.ts", import.meta.url), {
    type: "module",
  }),
  device: "webgpu",
});

const result = streamText({
  model,
  messages: [{ role: "user", content: "Explain Web Workers briefly." }],
});

for await (const chunk of result.textStream) {
  console.log(chunk);
}

Download Progress Tracking (browser)

When a model is first used, weights and tokenizer files must be loaded. Progress is handled internally and streamed back.

Basic Progress Monitoring

import { streamText } from "ai";
import { transformersJS } from "@built-in-ai/transformers-js";

const model = transformersJS("HuggingFaceTB/SmolLM2-360M-Instruct");
const availability = await model.availability();

if (availability === "downloadable") {
  await model.createSessionWithProgress(({ progress }) => {
    console.log(`Download progress: ${Math.round(progress * 100)}%`);
  });
}

// Ready to use
const result = streamText({
  model,
  messages: [{ role: "user", content: "Hello!" }],
});

Integration with useChat Hook

When using this library with the useChat hook, you can create hybrid applications that seamlessly switch between client-side and server-side inference.

See the complete working example in examples/next-hybrid/ which includes:

  • Automatic fallback: Client-side when supported, server-side otherwise
  • Download progress: Real-time progress tracking for model downloads
  • Error handling: Graceful error handling and notifications
  • Full integration: Complete integration with useChat hook

Embeddings

This package supports text embeddings using Transformers.js with full type safety and proper tensor handling.

Basic Embedding Usage

import { embed, embedMany } from "ai";
import { transformersJS } from "@built-in-ai/transformers-js";

// Single embedding
const { embedding } = await embed({
  model: transformersJS.textEmbedding("Supabase/gte-small"),
  value: "sunny day at the beach",
});

console.log(`Dimensions: ${embedding.length}`); // 384 for gte-small

// Multiple embeddings with automatic chunking
const { embeddings } = await embedMany({
  model: transformersJS.textEmbedding("Supabase/gte-small"),
  values: [
    "sunny day at the beach",
    "rainy afternoon in the city",
    "snowy night in the mountains",
  ],
});

console.log(`Generated ${embeddings.length} embeddings`);
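The resulting vectors can be compared directly, e.g. with the AI SDK's cosineSimilarity helper (a small sketch building on the embedMany call above):

import { cosineSimilarity } from "ai";

// Compare the first two embeddings from the embedMany call above
const similarity = cosineSimilarity(embeddings[0], embeddings[1]);
console.log(`Similarity: ${similarity.toFixed(3)}`);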

Advanced Embedding Configuration

const embeddingModel = transformersJS.textEmbedding("Supabase/gte-small", {
  device: "webgpu", // Use WebGPU for acceleration
  dtype: "q8", // Quantization level
  normalize: true, // Normalize embeddings (default: true)
  pooling: "mean", // Pooling strategy: 'mean', 'cls', or 'max'
  maxTokens: 512, // Maximum input tokens
});

const { embedding } = await embed({
  model: embeddingModel,
  value: "your text here",
});

Embedding Progress Tracking

const embeddingModel = transformersJS.textEmbedding("Supabase/gte-small");

// Check if model needs downloading
const availability = await embeddingModel.availability();

if (availability === "downloadable") {
  await embeddingModel.createSessionWithProgress(({ progress }) => {
    console.log(`Embedding model download: ${Math.round(progress * 100)}%`);
  });
}

// Now ready to use
const { embedding } = await embed({
  model: embeddingModel,
  value: "your text here",
});

Transcription

This package supports audio transcription using Transformers.js models with type safety and proper audio handling.

Basic Transcription Usage

import { experimental_transcribe as transcribe } from "ai";
import { transformersJS } from "@built-in-ai/transformers-js";

// Basic transcription
const transcript = await transcribe({
  model: transformersJS.transcription("Xenova/whisper-base"),
  audio: audioFile,
});

console.log(transcript.text); // Transcribed text string
console.log(transcript.segments); // Array of segments with timestamps (if available)
console.log(transcript.language); // Detected language (if available)
console.log(transcript.durationInSeconds); // Audio duration (if available)

Advanced Transcription Configuration

const transcriptionModel = transformersJS.transcription("Xenova/whisper-base", {
  device: "webgpu", // Use WebGPU for acceleration
  dtype: "fp16", // Data type for model weights
  maxNewTokens: 448, // Maximum tokens to generate
  language: "en", // Language hint for better accuracy
  returnTimestamps: true, // Return segment timestamps
});

const transcript = await transcribe({
  model: transcriptionModel,
  audio: audioBuffer,
  providerOptions: {
    "transformers-js": {
      language: "fr", // Override language per request
      returnTimestamps: true, // Enable timestamps
      maxNewTokens: 512, // Override max tokens
    },
  },
});

Transcription with Web Worker (Browser)

For better performance and to avoid blocking the main thread, you can run transcription in a Web Worker:

1. Create a worker file (worker.ts):

import { TransformersJSTranscriptionWorkerHandler } from "@built-in-ai/transformers-js";

const handler = new TransformersJSTranscriptionWorkerHandler();
self.onmessage = (msg: MessageEvent) => {
  handler.onmessage(msg);
};

2. Use the worker in your application:

// Create a Web Worker for off-main-thread processing (the worker.ts from step 1)
const worker = new Worker(new URL("./worker.ts", import.meta.url), {
  type: "module",
});

const transcriptionModel = transformersJS.transcription(
  "onnx-community/whisper-base",
  {
    worker: worker,
    device: "webgpu",
  },
);

const transcript = await transcribe({
  model: transcriptionModel,
  audio: audioFile,
});

console.log(transcript.text);

That's it! The worker handler takes care of all the complex model loading, audio processing, and communication with the main thread.

Transcription Progress Tracking

To keep track of model loading in your app, use the createSessionWithProgress method as shown below:

const transcriptionModel = transformersJS.transcription("Xenova/whisper-base");

// Check if model needs downloading
const availability = await transcriptionModel.availability();

if (availability === "downloadable") {
  await transcriptionModel.createSessionWithProgress(({ progress }) => {
    console.log(`Transcription model download: ${Math.round(progress * 100)}%`);
  });
}

// Now ready to transcribe
const transcript = await transcribe({
  model: transcriptionModel,
  audio: audioData,
});

Supported Audio Formats

The transcription model accepts audio in multiple formats:

// Uint8Array (raw audio bytes)
const audioBytes = new Uint8Array(audioBuffer);
await transcribe({ model, audio: audioBytes });

// ArrayBuffer
const audioBuffer = await audioFile.arrayBuffer();
await transcribe({ model, audio: audioBuffer });

// Base64 encoded string
const audioBase64 = btoa(audioString);
await transcribe({ model, audio: audioBase64 });

// File from disk (Node.js)
const audioFile = await readFile("recording.wav");
await transcribe({ model, audio: audioFile });
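In the browser, audio typically arrives as a File or Blob, which can be converted to one of the accepted formats (illustrative sketch):

// Browser: File or Blob (e.g. from <input type="file"> or MediaRecorder)
const bytes = new Uint8Array(await fileOrBlob.arrayBuffer());
await transcribe({ model, audio: bytes });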

API Reference

transformersJS(modelId, settings?)

Creates a Transformers.js language model instance.

Parameters:

  • modelId: A Hugging Face model ID (e.g. "HuggingFaceTB/SmolLM2-360M-Instruct", "onnx-community/Llama-3.2-1B-Instruct-q4f16")
  • settings (optional):
    • device?: "auto" | "cpu" | "webgpu" | "gpu" – Inference device (default: "auto")
    • dtype?: "auto" | "fp32" | "fp16" | "q8" | "q4" | "q4f16" – Data type for model weights (default: "auto")
    • isVisionModel?: boolean – Whether this is a vision model that can process images (default: false)
    • worker?: Worker – Optional Web Worker to run the model off the main thread (browser only)
    • initProgressCallback?: (progress: { progress: number }) => void – Progress callback for model initialization
    • rawInitProgressCallback?: (progress: ProgressInfo) => void – Raw progress callback from Transformers.js

Returns: TransformersJSLanguageModel

transformersJS.languageModel(modelId, settings?)

Alias for transformersJS(modelId, settings?). Creates a language model instance.

transformersJS.textEmbedding(modelId, settings?)

Creates a Transformers.js embedding model instance with proper type safety.

Parameters:

  • modelId: A Hugging Face embedding model ID (e.g. "Supabase/gte-small", "Xenova/all-MiniLM-L6-v2")
  • settings (optional):
    • device?: "auto" | "cpu" | "webgpu" – Inference device (default: "auto")
    • dtype?: "auto" | "fp32" | "fp16" | "q8" | "q4" | "q4f16" – Data type for model weights (default: "auto")
    • normalize?: boolean (default: true) – Whether to normalize embeddings
    • pooling?: "mean" | "cls" | "max" (default: "mean") – Pooling strategy for token embeddings
    • maxTokens?: number (default: 512) – Maximum input tokens
    • initProgressCallback?: (progress: { progress: number }) => void – Progress callback for model initialization
    • rawInitProgressCallback?: (progress: ProgressInfo) => void – Raw progress callback from Transformers.js

Returns: TransformersJSEmbeddingModel

transformersJS.textEmbeddingModel(modelId, settings?)

Alias for transformersJS.textEmbedding(modelId, settings?). Creates an embedding model instance.

TransformersJSEmbeddingModel.availability()

Checks current availability status for the embedding model.

Returns: Promise<"unavailable" | "downloadable" | "available">

TransformersJSEmbeddingModel.createSessionWithProgress(onProgress?)

Creates/initializes an embedding model session with optional progress monitoring.

Parameters:

  • onProgress?: (p: { progress: number }) => void

Returns: Promise<TransformersJSEmbeddingModel>

transformersJS.transcription(modelId, settings?)

Creates a Transformers.js transcription model instance with proper type safety.

Parameters:

  • modelId: A Hugging Face Whisper model ID (e.g. "Xenova/whisper-base", "openai/whisper-small", "openai/whisper-large-v3")
  • settings (optional):
    • device?: "auto" | "cpu" | "webgpu" – Inference device (default: "auto")
    • dtype?: "auto" | "fp32" | "fp16" | "q8" | "q4" – Data type for model weights (default: "auto")
    • maxNewTokens?: number (default: 448) – Maximum number of tokens to generate
    • language?: string – Language hint for better transcription accuracy
    • returnTimestamps?: boolean (default: false) – Return timestamps for segments
    • worker?: Worker – Optional Web Worker to run the model off the main thread (browser only)
    • initProgressCallback?: (progress: { progress: number }) => void – Progress callback for model initialization
    • rawInitProgressCallback?: (progress: ProgressInfo) => void – Raw progress callback from Transformers.js

Returns: TransformersJSTranscriptionModel

transformersJS.transcriptionModel(modelId, settings?)

Alias for transformersJS.transcription(modelId, settings?). Creates a transcription model instance.

TransformersJSTranscriptionModel.availability()

Checks current availability status for the transcription model.

Returns: Promise<"unavailable" | "downloadable" | "available">

TransformersJSTranscriptionModel.createSessionWithProgress(onProgress?)

Creates/initializes a transcription model session with optional progress monitoring.

Parameters:

  • onProgress?: (p: { progress: number }) => void

Returns: Promise<TransformersJSTranscriptionModel>

TransformersJSTranscriptionWorkerHandler

A worker handler class that simplifies running transcription models in Web Workers.

doesBrowserSupportTransformersJS(): boolean

Checks if the browser supports TransformersJS with optimal performance. Returns true if the browser has WebGPU or WebAssembly support, false otherwise.

TransformersUIMessage

Extended UI message type for use with the useChat hook that includes custom data parts.

Type:

type TransformersUIMessage = UIMessage<
  never,
  {
    modelDownloadProgress: {
      status: "downloading" | "complete" | "error";
      progress?: number;
      message: string;
    };
    notification: {
      message: string;
      level: "info" | "warning" | "error";
    };
  }
>;
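For example, the download-progress data part can be read off message parts (a sketch; assumes AI SDK v5, where custom data parts appear in message.parts with a data- prefix):

import { useChat } from "@ai-sdk/react";
import type { TransformersUIMessage } from "@built-in-ai/transformers-js";

// Inside a React component:
const { messages } = useChat<TransformersUIMessage>();

for (const message of messages) {
  for (const part of message.parts) {
    if (part.type === "data-modelDownloadProgress") {
      console.log(part.data.message, part.data.progress);
    }
  }
}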

TransformersJSLanguageModel.createSessionWithProgress(onProgress?)

Creates/initializes a model session with optional progress monitoring.

Parameters:

  • onProgress?: (p: { progress: number }) => void

Returns: Promise<TransformersJSLanguageModel>

TransformersJSLanguageModel.availability()

Checks current availability status.

Returns: Promise<"unavailable" | "downloadable" | "available">

TransformersJSWorkerHandler

Utility handler for Web Worker usage. Attach it to the worker onmessage to handle load, generate, interrupt, and reset messages.

import { TransformersJSWorkerHandler } from "@built-in-ai/transformers-js";

const handler = new TransformersJSWorkerHandler();
self.onmessage = (msg: MessageEvent) => handler.onmessage(msg);

Author

2025 © Jakob Hoeg Mørk

Credits

The Hugging Face, Transformers.js, and Vercel AI SDK teams