

react-native-transformers


Run Hugging Face transformer models directly on your React Native and Expo applications with on-device inference. Support for text generation, image understanding, and multimodal AI - no cloud service required!

Overview

react-native-transformers empowers your mobile applications with comprehensive AI capabilities by running transformer models directly on the device. This means your app can generate text, understand images, create embeddings, and process multimodal content without sending data to external servers - enhancing privacy, reducing latency, and enabling offline functionality.

Built on top of ONNX Runtime, this library provides a streamlined API for integrating state-of-the-art language and vision models into your React Native and Expo applications with minimal configuration.

Key Features

  • On-device inference: Run AI models locally without requiring an internet connection
  • Multimodal support: Process text, images, and combined text-image inputs
  • Vision-language models: Generate descriptions from images, answer questions about visual content
  • Image embeddings: Create vector representations for similarity search and clustering
  • Privacy-focused: Keep user data on the device without sending it to external servers
  • Optimized performance: Leverages ONNX Runtime for efficient model execution on mobile CPUs
  • Simple API: Easy-to-use interface for model loading and inference (see the condensed sketch after this list)
  • Expo compatibility: Works seamlessly with both Expo managed and bare workflows
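
At its simplest, the text-generation flow is just an init call followed by generate. A condensed sketch of the full component example shown later in the Usage section:

import { Pipeline } from 'react-native-transformers';

// Inside an async function in your app:
// load the model once (the fetch option resolves model file URLs to local paths or URLs)
await Pipeline.TextGeneration.init(
  'Felladrin/onnx-Llama-160M-Chat-v1',
  'onnx/decoder_model_merged.onnx',
  { fetch: async (url) => (await fetch(url)).url }
);

// Stream generated text through the callback
Pipeline.TextGeneration.generate('Hello, world!', (text) => console.log(text));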

Installation

1. Install peer dependencies

npm install onnxruntime-react-native

2. Install react-native-transformers

# React-Native
npm install react-native-transformers

# Expo
npx expo install react-native-transformers

3. Platform Configuration

Link the onnxruntime-react-native library:

npx react-native link onnxruntime-react-native

Add the Expo plugin configuration in app.json or app.config.js:

{ "expo": { "plugins": ["onnxruntime-react-native"] } }

4. Babel Configuration

Add the babel-plugin-transform-import-meta plugin to your Babel configuration:

// babel.config.js
module.exports = {
  // ... your existing config
  plugins: [
    // ... your existing plugins
    'babel-plugin-transform-import-meta',
  ],
};

You can follow this document to create the config file if you don't already have one. After updating it, run npx expo start --clear to clear the Metro bundler cache.
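
For reference, a complete babel.config.js for an Expo project might look like the following (a sketch assuming the standard babel-preset-expo preset; merge it with your existing configuration):

// babel.config.js
module.exports = function (api) {
  api.cache(true);
  return {
    presets: ['babel-preset-expo'],
    plugins: ['babel-plugin-transform-import-meta'],
  };
};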

5. Development Client Setup

For development and testing, you must use a development client instead of Expo Go, because ONNX Runtime and react-native-transformers include native code.

You can set up a development client with, for example, a local native build (npx expo prebuild followed by npx expo run:android or npx expo run:ios) or an EAS development build.

Peer Dependencies for Image Processing

If you plan to use image processing capabilities, you'll need to install these peer dependencies:

# For Expo managed workflow
npx expo install expo-gl expo-gl-cpp expo-asset expo-image-manipulator

# Or for bare React Native
npm install expo-gl expo-gl-cpp expo-asset expo-image-manipulator

Usage

Text Generation

import React, { useState, useEffect } from 'react';
import { View, Text, Button } from 'react-native';
import { Pipeline } from 'react-native-transformers';

export default function App() {
  const [output, setOutput] = useState('');
  const [isLoading, setIsLoading] = useState(false);
  const [isModelReady, setIsModelReady] = useState(false);

  // Load model on component mount
  useEffect(() => {
    loadModel();
  }, []);

  const loadModel = async () => {
    setIsLoading(true);
    try {
      // Load a small Llama model
      await Pipeline.TextGeneration.init(
        'Felladrin/onnx-Llama-160M-Chat-v1',
        'onnx/decoder_model_merged.onnx',
        {
          // The fetch function is required to download model files
          fetch: async (url) => {
            // In a real app, you might want to cache the downloaded files
            const response = await fetch(url);
            return response.url;
          },
        }
      );
      setIsModelReady(true);
    } catch (error) {
      console.error('Error loading model:', error);
      alert('Failed to load model: ' + error.message);
    } finally {
      setIsLoading(false);
    }
  };

  const generateText = () => {
    setOutput('');
    // Generate text from the prompt and update the UI as tokens are generated
    Pipeline.TextGeneration.generate(
      'Write a short poem about programming:',
      (text) => setOutput(text)
    );
  };

  return (
    <View style={{ padding: 20 }}>
      <Button
        title={isModelReady ? 'Generate Text' : 'Load Model'}
        onPress={isModelReady ? generateText : loadModel}
        disabled={isLoading}
      />
      <Text style={{ marginTop: 20 }}>
        {output || 'Generated text will appear here'}
      </Text>
    </View>
  );
}

With Custom Model Download

For Expo applications, use expo-file-system to download models with progress tracking:

import * as FileSystem from 'expo-file-system';
import { Pipeline } from 'react-native-transformers';

// In your model loading function
await Pipeline.TextGeneration.init('model-repo', 'model-file', {
  fetch: async (url) => {
    const localPath = FileSystem.cacheDirectory + url.split('/').pop();

    // Check if file already exists
    const fileInfo = await FileSystem.getInfoAsync(localPath);
    if (fileInfo.exists) {
      console.log('Model already downloaded, using cached version');
      return localPath;
    }

    // Download file with progress tracking
    const downloadResumable = FileSystem.createDownloadResumable(
      url,
      localPath,
      {},
      (progress) => {
        const percentComplete =
          progress.totalBytesWritten / progress.totalBytesExpectedToWrite;
        console.log(
          `Download progress: ${(percentComplete * 100).toFixed(1)}%`
        );
      }
    );

    const result = await downloadResumable.downloadAsync();
    return result?.uri;
  },
});

Image-Text Generation

Generate text descriptions from images or answer questions about images using vision-language models:

import React, { useState } from 'react';
import { View, Text, Button } from 'react-native';
import { Pipeline, ImageTensorUtils } from 'react-native-transformers';
import * as ImagePicker from 'expo-image-picker';

export default function ImageApp() {
  const [output, setOutput] = useState('');
  const [isModelReady, setIsModelReady] = useState(false);
  const [isProcessing, setIsProcessing] = useState(false);
  const [streamingText, setStreamingText] = useState('');
  const [mode, setMode] = useState<'describe' | 'question'>('describe');

  const loadImageTextModel = async () => {
    try {
      // Load a vision-language model like Moondream2
      await Pipeline.ImageTextGeneration.init(
        'Xenova/moondream2',
        'onnx/decoder_model_merged.onnx',
        {
          fetch: async (url) => {
            // Your model download logic here
            return url;
          },
        }
      );
      setIsModelReady(true);
    } catch (error) {
      console.error('Error loading model:', error);
    }
  };

  const processImage = async () => {
    try {
      setIsProcessing(true);
      setOutput('');
      setStreamingText('');

      // Pick an image from the device
      const result = await ImagePicker.launchImageLibraryAsync({
        mediaTypes: ImagePicker.MediaTypeOptions.Images,
        allowsEditing: true,
        quality: 1,
      });

      if (!result.canceled && result.assets[0]) {
        // Convert image to tensor format (this is a simplified example)
        // In a real app, you'd need to process the image into Float32Array
        // using react-native-vision-camera or similar
        const imageData = new Float32Array(224 * 224 * 3); // Placeholder
        const imageDims: [number, number, number] = [224, 224, 3];

        // Callback function to handle streaming text updates
        const handleTextStream = (text: string) => {
          setStreamingText(text);
          // Optional: Update UI with typing indicator or partial text
        };

        if (mode === 'describe') {
          // Generate description with streaming updates
          const description = await Pipeline.ImageTextGeneration.describe(
            imageData,
            imageDims,
            handleTextStream,
            { max_tokens: 100 }
          );
          setOutput(description);
        } else {
          // Answer a specific question about the image
          const answer = await Pipeline.ImageTextGeneration.answerQuestion(
            imageData,
            imageDims,
            'What colors do you see in this image?',
            handleTextStream,
            { max_tokens: 50 }
          );
          setOutput(answer);
        }
      }
    } catch (error) {
      console.error('Error processing image:', error);
      setOutput('Error: ' + error.message);
    } finally {
      setIsProcessing(false);
      setStreamingText('');
    }
  };

  const stopGeneration = () => {
    Pipeline.ImageTextGeneration.stop();
    setIsProcessing(false);
    setStreamingText('');
  };

  return (
    <View style={{ padding: 20 }}>
      {!isModelReady ? (
        <Button title="Load Image Model" onPress={loadImageTextModel} />
      ) : (
        <View>
          <View style={{ flexDirection: 'row', marginBottom: 10 }}>
            <Button
              title={mode === 'describe' ? 'Describe Mode' : 'Question Mode'}
              onPress={() => setMode(mode === 'describe' ? 'question' : 'describe')}
            />
          </View>
          
          {!isProcessing ? (
            <Button title="Pick & Process Image" onPress={processImage} />
          ) : (
            <Button title="Stop Generation" onPress={stopGeneration} />
          )}
          
          {/* Show streaming text while processing */}
          {isProcessing && streamingText && (
            <View style={{ marginTop: 10, padding: 10, backgroundColor: '#f0f0f0' }}>
              <Text style={{ fontStyle: 'italic' }}>Generating: {streamingText}</Text>
            </View>
          )}
          
          {/* Show final output */}
          {output && (
            <View style={{ marginTop: 20 }}>
              <Text style={{ fontWeight: 'bold' }}>
                {mode === 'describe' ? 'Description:' : 'Answer:'}
              </Text>
              <Text style={{ marginTop: 10 }}>{output}</Text>
            </View>
          )}
        </View>
      )}
    </View>
  );
}

Understanding the Callback System

The callback system in image-text generation provides real-time streaming of generated text. Here's how it works:

// The callback receives the current complete text generated so far
const handleTextStream = (text: string) => {
  console.log('Current text:', text);
  // text contains the full generated text up to this point
  
  // You can implement various UI patterns:
  
  // 1. Show streaming text with typing effect
  setStreamingText(text);
  
  // 2. Show word-by-word updates
  const words = text.split(' ');
  setCurrentWordCount(words.length);
  
  // 3. Show character count or progress
  setCharacterCount(text.length);
  
  // 4. Update UI with partial results
  if (text.length > 10) {
    setPreviewText(text.substring(0, 50) + '...');
  }
};

// Use in generation
const result = await Pipeline.ImageTextGeneration.describe(
  imageData,
  imageDims,
  handleTextStream,
  { max_tokens: 100 }
);

// After generation completes, 'result' contains the final text
console.log('Final result:', result);

Key Points:

  • The callback receives the complete text generated so far, not just new tokens (the sketch after this list shows how to derive only the new portion)
  • Callbacks are called for each token generated (real-time streaming)
  • The final result is also returned when generation completes
  • You can stop generation at any time using Pipeline.ImageTextGeneration.stop()
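
Because each callback receives the full text so far rather than a delta, you can derive just the newly generated portion yourself. A minimal sketch (the previousText variable is illustrative, not part of the library):

let previousText = '';

const handleTextStream = (text: string) => {
  // The new tokens are whatever was appended since the last callback
  const delta = text.slice(previousText.length);
  previousText = text;
  console.log('New tokens:', delta);
};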

Image Embeddings

Generate vector embeddings from images for similarity search, clustering, or other ML tasks:

import React, { useState } from 'react';
import { View, Text, Button } from 'react-native';
import { Pipeline } from 'react-native-transformers';

export default function EmbeddingApp() {
  const [embeddings, setEmbeddings] = useState('');
  const [similarity, setSimilarity] = useState('');

  const loadEmbeddingModel = async () => {
    await Pipeline.ImageEmbedding.init(
      'Xenova/clip-vit-base-patch32',
      'onnx/vision_model.onnx',
      {
        fetch: async (url) => {
          // Your model download logic here
          return url;
        },
      }
    );
  };

  const generateEmbeddings = async () => {
    try {
      // Create sample image data (in real app, process actual images)
      const imageData1 = new Float32Array(224 * 224 * 3);
      const imageData2 = new Float32Array(224 * 224 * 3);
      const imageDims: [number, number, number] = [224, 224, 3];

      // Generate embeddings for both images
      const embedding1 = await Pipeline.ImageEmbedding.embedImage(
        imageData1,
        imageDims
      );
      const embedding2 = await Pipeline.ImageEmbedding.embedImage(
        imageData2,
        imageDims
      );

      // Calculate similarity between embeddings
      const similarity = Pipeline.ImageEmbedding.cosineSimilarity(
        embedding1,
        embedding2
      );

      setEmbeddings(`Embedding dimensions: ${embedding1.length}`);
      setSimilarity(`Similarity: ${similarity.toFixed(4)}`);

      // For multi-modal embeddings (image + text)
      const multiModalEmbedding = await Pipeline.ImageEmbedding.embedImageText(
        imageData1,
        imageDims,
        'A photo of a cat'
      );
    } catch (error) {
      console.error('Error generating embeddings:', error);
    }
  };

  return (
    <View style={{ padding: 20 }}>
      <Button title="Load Embedding Model" onPress={loadEmbeddingModel} />
      <Button title="Generate Embeddings" onPress={generateEmbeddings} />
      <Text>{embeddings}</Text>
      <Text>{similarity}</Text>
    </View>
  );
}
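
Once you have embeddings for a set of images, finding the closest match is a simple loop over cosineSimilarity. A sketch (findMostSimilar is an illustrative helper, and the number[] element type is an assumption; the pipeline may return typed arrays):

import { Pipeline } from 'react-native-transformers';

// Return the index of the stored embedding most similar to the query embedding
function findMostSimilar(query: number[], stored: number[][]): number {
  let bestIndex = -1;
  let bestScore = -Infinity;
  stored.forEach((candidate, index) => {
    const score = Pipeline.ImageEmbedding.cosineSimilarity(query, candidate);
    if (score > bestScore) {
      bestScore = score;
      bestIndex = index;
    }
  });
  return bestIndex;
}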

Image Processing Utilities

The library includes utilities for processing images into the format expected by vision models:

import { ImageTensorUtils } from 'react-native-transformers';

// Process raw image data into model-ready tensor
const imageData = new Float32Array(224 * 224 * 3); // RGB image data
const dimensions: [number, number, number] = [224, 224, 3];

const tensor = ImageTensorUtils.processImageToTensor(imageData, dimensions, {
  normalize: true,
  mean: [0.485, 0.456, 0.406], // ImageNet normalization
  std: [0.229, 0.224, 0.225],
  inputRange: 'byte', // Input values are 0-255
  format: 'NCHW', // Tensor format
});

// Validate image dimensions against model requirements
const validation = ImageTensorUtils.validateDimensions(
  [224, 224, 3],
  {
    expectedSize: [224, 224],
    expectedChannels: 3,
    minSize: [32, 32],
    maxSize: [512, 512],
  }
);

if (!validation.valid) {
  console.error('Image validation failed:', validation.error);
}
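
Conceptually, the byte-input ImageNet normalization above maps each channel value as follows (a sketch of the math, not library code):

// For inputRange: 'byte', each 0-255 pixel value v in channel c becomes:
//   normalized = (v / 255 - mean[c]) / std[c]
const normalize = (v: number, c: number) =>
  (v / 255 - [0.485, 0.456, 0.406][c]) / [0.229, 0.224, 0.225][c];

// Example: a fully saturated red pixel (255) in the R channel (c = 0)
normalize(255, 0); // (1.0 - 0.485) / 0.229 ≈ 2.25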

Image Processing Integration

// Vision models expect Float32Array pixel data plus the image dimensions
const imageData = new Float32Array(224 * 224 * 3);
const imageDims: [number, number, number] = [224, 224, 3];

// Both parameters are required when calling the image pipelines
const description = await Pipeline.ImageTextGeneration.describe(imageData, imageDims, (text) => console.log(text));
const embedding = await Pipeline.ImageEmbedding.embedImage(imageData, imageDims);
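
To go from an actual image file to that Float32Array, one option is to resize the image with expo-image-manipulator and decode the JPEG pixels with jpeg-js. A sketch under those assumptions (neither package is bundled with this library, and the buffer polyfill must be installed):

import * as FileSystem from 'expo-file-system';
import * as ImageManipulator from 'expo-image-manipulator';
import jpeg from 'jpeg-js';
import { Buffer } from 'buffer';

// Resize a picked image to 224x224 and convert its RGBA pixels to a Float32Array of RGB values
async function imageToTensorInput(
  uri: string
): Promise<{ data: Float32Array; dims: [number, number, number] }> {
  const resized = await ImageManipulator.manipulateAsync(
    uri,
    [{ resize: { width: 224, height: 224 } }],
    { format: ImageManipulator.SaveFormat.JPEG }
  );
  const base64 = await FileSystem.readAsStringAsync(resized.uri, {
    encoding: FileSystem.EncodingType.Base64,
  });
  const { data, width, height } = jpeg.decode(Buffer.from(base64, 'base64'), {
    useTArray: true,
  });

  // Drop the alpha channel: RGBA -> RGB
  const rgb = new Float32Array(width * height * 3);
  for (let src = 0, dst = 0; src < data.length; src += 4, dst += 3) {
    rgb[dst] = data[src];
    rgb[dst + 1] = data[src + 1];
    rgb[dst + 2] = data[src + 2];
  }
  return { data: rgb, dims: [height, width, 3] };
}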

Supported Models

react-native-transformers works with ONNX-formatted models from Hugging Face. Here are some recommended models based on size and performance:

Text Models

| Model                                      | Type            | Size   | Description                         |
| ------------------------------------------ | --------------- | ------ | ----------------------------------- |
| Felladrin/onnx-Llama-160M-Chat-v1          | Text Generation | ~300MB | Small Llama model (160M parameters) |
| microsoft/Phi-3-mini-4k-instruct-onnx-web  | Text Generation | ~1.5GB | Microsoft's Phi-3-mini model        |
| Xenova/distilgpt2_onnx-quantized           | Text Generation | ~165MB | Quantized DistilGPT-2               |
| Xenova/tiny-mamba-onnx                     | Text Generation | ~85MB  | Tiny Mamba model                    |
| Xenova/all-MiniLM-L6-v2-onnx               | Text Embedding  | ~80MB  | Sentence embedding model            |

Vision Models

| Model                        | Type                  | Size   | Description                                   |
| ---------------------------- | --------------------- | ------ | --------------------------------------------- |
| Xenova/moondream2            | Image-Text Generation | ~1.7GB | Vision-language model for image understanding |
| Xenova/clip-vit-base-patch32 | Image Embedding       | ~300MB | CLIP model for image-text embeddings          |
| Xenova/vit-base-patch16-224  | Image Classification  | ~350MB | Vision Transformer for image classification   |

Note: Vision models require image data to be processed into specific tensor formats. Use the provided ImageTensorUtils and image loading utilities for proper image preprocessing.

API Reference

For detailed API documentation, please visit our TypeDoc documentation.

Contributing

Contributions are welcome! See the contributing guide to learn how to contribute to the repository and the development workflow.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

External Links