react-native-silver-train
On-device ML inference for React Native. Run any ONNX or CoreML model locally on iOS with full Neural Engine support.
Features
✨ Universal Model Support - Run ANY ONNX or CoreML model
- 🤖 Text Generation - LLMs (TinyLlama, GPT-2, Llama, etc.)
- 🖼️ Image Classification - MobileNet, ResNet, EfficientNet
- 🎯 Object Detection - YOLO, SSD, Faster R-CNN
- 📊 Embeddings - Sentence-BERT, CLIP, custom models
- 🔧 Custom Models - Bring your own ONNX/CoreML model
⚡ Performance Optimized
- iOS Neural Engine acceleration
- Native Swift implementation for CoreML
- ONNX Runtime integration
- Real-time performance monitoring
🎯 Developer Friendly
- Full TypeScript support
- Two APIs: Simple text generation + Generic tensor inference
- React hooks for easy integration
- Comprehensive documentation
Installation
```sh
npm install react-native-silver-train
# or
yarn add react-native-silver-train
```
Install peer dependencies:
```sh
npm install @react-native-async-storage/async-storage onnxruntime-react-native react-native-fs
# or
yarn add @react-native-async-storage/async-storage onnxruntime-react-native react-native-fs
```
iOS Setup
```sh
cd ios && pod install
```
Quick Start
Text Generation (LLMs)
```ts
import { MLInferenceManager } from 'react-native-silver-train';

const manager = MLInferenceManager.getInstance();

// Load your model
await manager.loadModel({
  id: 'tinyllama',
  name: 'TinyLlama',
  path: '/path/to/model.onnx',
  type: 'onnx',
  size: 1400000000,
});

// Run inference
const result = await manager.runInference('tinyllama', 'Hello, how are you?', {
  temperature: 0.8,
  maxTokens: 50,
});

console.log(result.text);
```
Generic Tensor Inference (Any Model)
```ts
import { MLInferenceManager } from 'react-native-silver-train';
import type { TensorInput } from 'react-native-silver-train';

const manager = MLInferenceManager.getInstance();

// Load image classification model
await manager.loadModel({
  id: 'mobilenet',
  name: 'MobileNet V2',
  path: '/path/to/mobilenet.onnx',
  type: 'onnx',
  size: 14000000,
  metadata: {
    modelType: 'image-classification'
  }
});

// Prepare input tensor (e.g., normalized image data)
const imageData = new Float32Array(224 * 224 * 3);
// ... fill with your preprocessed image data ...

const inputs: TensorInput[] = [{
  name: 'input',
  data: imageData,
  shape: [1, 224, 224, 3],
  type: 'float32'
}];

// Run inference
const result = await manager.runTensorInference('mobilenet', inputs);
const predictions = result.outputs['output'].data;
```
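The preprocessing step is left to you and depends on how your model was trained. As one illustrative approach (not part of the library), here is a hedged sketch that converts interleaved RGBA bytes from a decoded 224×224 image into a normalized NHWC Float32Array; the simple 0–1 scaling is a placeholder for whatever normalization your model expects:
```ts
// Hypothetical helper: convert 224x224 RGBA bytes into a [1, 224, 224, 3] float tensor.
// The 0-1 scaling below is a placeholder; apply the mean/std normalization your model was trained with.
function rgbaToNHWCFloat32(rgba: Uint8Array | Uint8ClampedArray, width = 224, height = 224): Float32Array {
  const out = new Float32Array(width * height * 3);
  for (let i = 0; i < width * height; i++) {
    // Drop the alpha channel and scale each channel from [0, 255] to [0, 1].
    out[i * 3 + 0] = rgba[i * 4 + 0] / 255;
    out[i * 3 + 1] = rgba[i * 4 + 1] / 255;
    out[i * 3 + 2] = rgba[i * 4 + 2] / 255;
  }
  return out;
}
```
The NHWC shape [1, 224, 224, 3] matches the MobileNet example above; channel-first (NCHW) models, like the YOLO example later in this README, need a channel-first layout instead.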
Using React Hooks
```tsx
import { View, Text } from 'react-native';
import { useMLInference, useSystemMetrics } from 'react-native-silver-train';

function MyComponent() {
  const { models, loadModel, loadState } = useMLInference();
  const metrics = useSystemMetrics(5000); // Update every 5 seconds

  return (
    <View>
      <Text>Battery: {metrics?.battery.level}%</Text>
      <Text>Memory: {((metrics?.memory.used ?? 0) / 1024 / 1024).toFixed(2)} MB</Text>
      <Text>Thermal: {metrics?.thermal.state}</Text>
    </View>
  );
}
```
API Documentation
Two Inference APIs
1. Text Generation API (Optimized for LLMs)
```ts
runInference(
  modelId: string,
  input: string,
  options?: InferenceOptions
): Promise<InferenceResult>
```
Best for: Language models, text generation, chat
2. Generic Tensor API (Works with ANY model)
```ts
runTensorInference(
  modelId: string,
  inputs: TensorInput[] | Record<string, TensorInput>
): Promise<GenericInferenceResult>
```
Best for: Image classification, object detection, embeddings, custom models
Core Services
- `MLInferenceManager` - Main inference service
- `ModelCacheManager` - Model caching and management
- `TinyLlamaTokenizer` - Built-in tokenizer for Llama models
React Hooks
- `useMLInference()` - Model loading and inference state
- `useSystemMetrics(interval)` - Real-time system monitoring
Native Modules
- `coreMLInference` - CoreML model execution
- `coreMLTextGeneration` - Optimized text generation with CoreML
- `gpt2Tokenizer` - Native Swift BPE tokenizer (10x faster)
- `performanceMonitor` - System metrics monitoring
Model Configuration
```ts
interface ModelConfig {
  id: string;               // Unique identifier
  name: string;             // Display name
  path: string;             // File path to model
  type: 'coreml' | 'onnx';  // Model type
  size: number;             // File size in bytes

  // Optional: For text generation models
  maxTokens?: number;
  temperature?: number;
  topP?: number;

  // Optional: Model shape information
  inputShape?: number[];
  outputShape?: number[];
  inputNames?: string[];
  outputNames?: string[];

  metadata?: {
    modelType?: 'text-generation' | 'image-classification' | 'object-detection' | 'embedding' | 'custom';
    description?: string;
    version?: string;
  };
}
```
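For illustration, a hypothetical config for a CoreML classifier, using the `manager` instance from the Quick Start; the path, size, shapes, and tensor names below are placeholders for your own model, not values shipped with the library:
```ts
// Hypothetical configuration for a CoreML image classifier.
// All values (path, size, shapes, tensor names) are placeholders for your own model.
const mobileNetConfig: ModelConfig = {
  id: 'mobilenet-coreml',
  name: 'MobileNet V2 (CoreML)',
  path: '/path/to/MobileNetV2.mlmodelc',
  type: 'coreml',
  size: 14000000,
  inputShape: [1, 224, 224, 3],
  inputNames: ['input'],
  outputNames: ['output'],
  metadata: {
    modelType: 'image-classification',
    description: 'Example classifier config',
    version: '1.0.0',
  },
};

await manager.loadModel(mobileNetConfig);
```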
Supported Model Types
| Model Type | Examples | API |
|------------|----------|-----|
| Text Generation | TinyLlama, GPT-2, Llama 2/3 | runInference() |
| Image Classification | MobileNet, ResNet, EfficientNet | runTensorInference() |
| Object Detection | YOLO, SSD, Faster R-CNN | runTensorInference() |
| Embedding | Sentence-BERT, CLIP | runTensorInference() |
| Custom | Your ONNX/CoreML model | runTensorInference() |
Examples
Image Classification
```ts
// MobileNet example
const result = await manager.runTensorInference('mobilenet', [{
  data: imageData,
  shape: [1, 224, 224, 3],
  type: 'float32'
}]);

const predictions = result.outputs['output'].data as Float32Array;
const topClass = predictions.indexOf(Math.max(...predictions));
```
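If you need more than the single best class, a hedged sketch of a top-K selection over the same `predictions` array; the `labels` array (e.g. `imagenetLabels`) is a hypothetical class-name list you supply yourself:
```ts
// Hypothetical top-K helper over raw class scores.
// `labels` is an array of class names you provide; it is not part of the library.
function topK(scores: Float32Array, labels: string[], k = 5) {
  return Array.from(scores)
    .map((score, index) => ({ score, label: labels[index] }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

console.log(topK(predictions, imagenetLabels, 5));
```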
Object Detection
```ts
// YOLO example
const result = await manager.runTensorInference('yolo', [{
  data: imageData,
  shape: [1, 3, 640, 640],
  type: 'float32'
}]);

const boxes = result.outputs['boxes'].data;
const scores = result.outputs['scores'].data;
const classes = result.outputs['classes'].data;
```
Embeddings
```ts
// Sentence-BERT example
// tokenIds and attentionMask are assumed to be bigint[] produced by your tokenizer.
const result = await manager.runTensorInference('sentence-bert', {
  'input_ids': {
    data: new BigInt64Array(tokenIds),
    shape: [1, tokenIds.length],
    type: 'int64'
  },
  'attention_mask': {
    data: new BigInt64Array(attentionMask),
    shape: [1, attentionMask.length],
    type: 'int64'
  }
});

const embedding = result.outputs['last_hidden_state'].data;
```
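`last_hidden_state` is a per-token tensor; to get a single sentence vector you typically mean-pool it over the non-padding tokens. A minimal sketch, assuming the output data is a flat Float32Array in [1, seqLen, hidden] order and that the hidden size (384 here, common for small Sentence-BERT variants) matches your model:
```ts
// Hypothetical mean pooling over last_hidden_state, masked by attention_mask.
// hiddenSize must match your model; 384 below is only an example value.
function meanPool(hidden: Float32Array, mask: bigint[], hiddenSize: number): Float32Array {
  const pooled = new Float32Array(hiddenSize);
  let count = 0;
  for (let t = 0; t < mask.length; t++) {
    if (mask[t] === 0n) continue; // skip padding tokens
    count++;
    for (let h = 0; h < hiddenSize; h++) {
      pooled[h] += hidden[t * hiddenSize + h];
    }
  }
  for (let h = 0; h < hiddenSize; h++) pooled[h] /= Math.max(count, 1);
  return pooled;
}

const sentenceVector = meanPool(embedding as Float32Array, attentionMask, 384);
```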
Advanced Usage
Performance Monitoring
```ts
// Check device capabilities
const capabilities = await manager.getDeviceCapabilities();
console.log('Neural Engine:', capabilities.hasNeuralEngine);

// Monitor during inference
const metrics = await manager.getSystemMetrics();
if (metrics.thermal.shouldThrottle) {
  console.warn('Device is hot, reducing inference frequency');
}
```
Streaming Text Generation
```ts
await manager.runStreamingInference(
  'tinyllama',
  'Write a story about',
  { maxTokens: 100 },
  (token, isDone) => {
    console.log(token);
    if (isDone) {
      console.log('Generation complete!');
    }
  }
);
```
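In a component you would usually append each token to state instead of logging it. A minimal sketch, assuming the same callback signature as above; the component and state names are purely illustrative:
```tsx
import { useState } from 'react';
import { Text } from 'react-native';
import { MLInferenceManager } from 'react-native-silver-train';

// Illustrative component: streams tokens into local state as they arrive.
function StreamingStory() {
  const [story, setStory] = useState('');

  const generate = async () => {
    setStory('');
    await MLInferenceManager.getInstance().runStreamingInference(
      'tinyllama',
      'Write a story about',
      { maxTokens: 100 },
      (token, isDone) => {
        setStory((prev) => prev + token);
        if (isDone) console.log('Generation complete!');
      }
    );
  };

  return <Text onPress={generate}>{story || 'Tap to generate'}</Text>;
}
```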
Model Caching
```ts
import { ModelCacheManager } from 'react-native-silver-train';

const cacheManager = ModelCacheManager.getInstance();

// Get all cached models
const models = cacheManager.getAllCachedModels();

// Clear cache
await cacheManager.clearCache();

// Get cache size
const size = await cacheManager.getCacheSize();
```
Documentation
- 📖 Generic Inference Guide - Complete guide for any model type
- 🎯 API Reference - Full API documentation
- 💡 Examples - Code examples for common use cases
Requirements
- iOS 13.0 or higher
- React Native 0.70+
- CocoaPods
Performance
- Neural Engine: Automatic acceleration on iOS devices with Neural Engine (A11+)
- Native Tokenizers: 10x faster than JavaScript implementations
- Optimized Inference: Efficient memory management and thermal monitoring
Contributing
See CONTRIBUTING.md for contribution guidelines.
License
MIT
Acknowledgments
- Built with ONNX Runtime
- Powered by Apple's CoreML
- Created with create-react-native-library
Made with ❤️ for the React Native community
