react-native-silver-train
On-device ML inference for React Native. Run any ONNX or CoreML model locally on iOS with full Neural Engine support.
Features
✨ Universal Model Support - Run ANY ONNX or CoreML model
- 🤖 Text Generation - LLMs (TinyLlama, GPT-2, Llama, etc.)
- 🖼️ Image Classification - MobileNet, ResNet, EfficientNet
- 🎯 Object Detection - YOLO, SSD, Faster R-CNN
- 📊 Embeddings - Sentence-BERT, CLIP, custom models
- 🔧 Custom Models - Bring your own ONNX/CoreML model
⚡ Performance Optimized
- iOS Neural Engine acceleration
- Native Swift implementation for CoreML
- ONNX Runtime integration
- Real-time performance monitoring
🎯 Developer Friendly
- Full TypeScript support
- Two APIs: Simple text generation + Generic tensor inference
- React hooks for easy integration
- Comprehensive documentation
Installation
```sh
npm install react-native-silver-train
# or
yarn add react-native-silver-train
```
Install peer dependencies:
```sh
npm install @react-native-async-storage/async-storage onnxruntime-react-native react-native-fs
# or
yarn add @react-native-async-storage/async-storage onnxruntime-react-native react-native-fs
```
iOS Setup
```sh
cd ios && pod install
```
Quick Start
Text Generation (LLMs)
```ts
import { MLInferenceManager } from 'react-native-silver-train';

const manager = MLInferenceManager.getInstance();

// Load your model
await manager.loadModel({
  id: 'tinyllama',
  name: 'TinyLlama',
  path: '/path/to/model.onnx',
  type: 'onnx',
  size: 1400000000,
});

// Run inference
const result = await manager.runInference('tinyllama', 'Hello, how are you?', {
  temperature: 0.8,
  maxTokens: 50,
});

console.log(result.text);
```
Generic Tensor Inference (Any Model)
```ts
import { MLInferenceManager } from 'react-native-silver-train';
import type { TensorInput } from 'react-native-silver-train';

const manager = MLInferenceManager.getInstance();

// Load image classification model
await manager.loadModel({
  id: 'mobilenet',
  name: 'MobileNet V2',
  path: '/path/to/mobilenet.onnx',
  type: 'onnx',
  size: 14000000,
  metadata: {
    modelType: 'image-classification'
  }
});

// Prepare input tensor (e.g., normalized image data)
const imageData = new Float32Array(224 * 224 * 3);
// ... fill with your preprocessed image data ...

const inputs: TensorInput[] = [{
  name: 'input',
  data: imageData,
  shape: [1, 224, 224, 3],
  type: 'float32'
}];

// Run inference
const result = await manager.runTensorInference('mobilenet', inputs);
const predictions = result.outputs['output'].data;
```
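The preprocessing step is left to you and depends on how your model was trained. As one illustrative approach (not part of the library), here is a hedged sketch that converts interleaved RGBA bytes from a decoded 224×224 image into a normalized NHWC Float32Array; the simple 0–1 scaling is a placeholder for whatever normalization your model expects:
```ts
// Hypothetical helper: convert 224x224 RGBA bytes into a [1, 224, 224, 3] float tensor.
// The 0-1 scaling below is a placeholder; apply the mean/std normalization your model was trained with.
function rgbaToNHWCFloat32(rgba: Uint8Array | Uint8ClampedArray, width = 224, height = 224): Float32Array {
  const out = new Float32Array(width * height * 3);
  for (let i = 0; i < width * height; i++) {
    // Drop the alpha channel and scale each channel from [0, 255] to [0, 1].
    out[i * 3 + 0] = rgba[i * 4 + 0] / 255;
    out[i * 3 + 1] = rgba[i * 4 + 1] / 255;
    out[i * 3 + 2] = rgba[i * 4 + 2] / 255;
  }
  return out;
}
```
The NHWC shape [1, 224, 224, 3] matches the MobileNet example above; channel-first (NCHW) models, like the YOLO example later in this README, need a channel-first layout instead.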
Using React Hooks
```tsx
import { View, Text } from 'react-native';
import { useMLInference, useSystemMetrics } from 'react-native-silver-train';

function MyComponent() {
  const { models, loadModel, loadState } = useMLInference();
  const metrics = useSystemMetrics(5000); // Update every 5 seconds

  return (
    <View>
      <Text>Battery: {metrics?.battery.level}%</Text>
      <Text>Memory: {((metrics?.memory.used ?? 0) / 1024 / 1024).toFixed(2)} MB</Text>
      <Text>Thermal: {metrics?.thermal.state}</Text>
    </View>
  );
}
```
API Documentation
Two Inference APIs
1. Text Generation API (Optimized for LLMs)
```ts
runInference(
  modelId: string,
  input: string,
  options?: InferenceOptions
): Promise<InferenceResult>
```
Best for: Language models, text generation, chat
2. Generic Tensor API (Works with ANY model)
```ts
runTensorInference(
  modelId: string,
  inputs: TensorInput[] | Record<string, TensorInput>
): Promise<GenericInferenceResult>
```
Best for: Image classification, object detection, embeddings, custom models
Core Services
- `MLInferenceManager` - Main inference service
- `ModelCacheManager` - Model caching and management
- `TinyLlamaTokenizer` - Built-in tokenizer for Llama models
React Hooks
- `useMLInference()` - Model loading and inference state
- `useSystemMetrics(interval)` - Real-time system monitoring
Native Modules
- `coreMLInference` - CoreML model execution
- `coreMLTextGeneration` - Optimized text generation with CoreML
- `gpt2Tokenizer` - Native Swift BPE tokenizer (10x faster)
- `performanceMonitor` - System metrics monitoring
Model Configuration
```ts
interface ModelConfig {
  id: string;               // Unique identifier
  name: string;             // Display name
  path: string;             // File path to model
  type: 'coreml' | 'onnx';  // Model type
  size: number;             // File size in bytes

  // Optional: For text generation models
  maxTokens?: number;
  temperature?: number;
  topP?: number;

  // Optional: Model shape information
  inputShape?: number[];
  outputShape?: number[];
  inputNames?: string[];
  outputNames?: string[];

  metadata?: {
    modelType?: 'text-generation' | 'image-classification' | 'object-detection' | 'embedding' | 'custom';
    description?: string;
    version?: string;
  };
}
```
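For illustration, a hypothetical config for a CoreML classifier, using the `manager` instance from the Quick Start; the path, size, shapes, and tensor names below are placeholders for your own model, not values shipped with the library:
```ts
// Hypothetical configuration for a CoreML image classifier.
// All values (path, size, shapes, tensor names) are placeholders for your own model.
const mobileNetConfig: ModelConfig = {
  id: 'mobilenet-coreml',
  name: 'MobileNet V2 (CoreML)',
  path: '/path/to/MobileNetV2.mlmodelc',
  type: 'coreml',
  size: 14000000,
  inputShape: [1, 224, 224, 3],
  inputNames: ['input'],
  outputNames: ['output'],
  metadata: {
    modelType: 'image-classification',
    description: 'Example classifier config',
    version: '1.0.0',
  },
};

await manager.loadModel(mobileNetConfig);
```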
Supported Model Types
| Model Type | Examples | API |
|------------|----------|-----|
| Text Generation | TinyLlama, GPT-2, Llama 2/3 | runInference() |
| Image Classification | MobileNet, ResNet, EfficientNet | runTensorInference() |
| Object Detection | YOLO, SSD, Faster R-CNN | runTensorInference() |
| Embedding | Sentence-BERT, CLIP | runTensorInference() |
| Custom | Your ONNX/CoreML model | runTensorInference() |
Examples
Image Classification
```ts
// MobileNet example
const result = await manager.runTensorInference('mobilenet', [{
  data: imageData,
  shape: [1, 224, 224, 3],
  type: 'float32'
}]);

const predictions = result.outputs['output'].data as Float32Array;
const topClass = predictions.indexOf(Math.max(...predictions));
```
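If you need more than the single best class, a hedged sketch of a top-K selection over the same `predictions` array; the `labels` array (e.g. `imagenetLabels`) is a hypothetical class-name list you supply yourself:
```ts
// Hypothetical top-K helper over raw class scores.
// `labels` is an array of class names you provide; it is not part of the library.
function topK(scores: Float32Array, labels: string[], k = 5) {
  return Array.from(scores)
    .map((score, index) => ({ score, label: labels[index] }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

console.log(topK(predictions, imagenetLabels, 5));
```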
Object Detection
```ts
// YOLO example
const result = await manager.runTensorInference('yolo', [{
  data: imageData,
  shape: [1, 3, 640, 640],
  type: 'float32'
}]);

const boxes = result.outputs['boxes'].data;
const scores = result.outputs['scores'].data;
const classes = result.outputs['classes'].data;
```
Embeddings
```ts
// Sentence-BERT example
// tokenIds and attentionMask are assumed to be bigint[] produced by your tokenizer.
const result = await manager.runTensorInference('sentence-bert', {
  'input_ids': {
    data: new BigInt64Array(tokenIds),
    shape: [1, tokenIds.length],
    type: 'int64'
  },
  'attention_mask': {
    data: new BigInt64Array(attentionMask),
    shape: [1, attentionMask.length],
    type: 'int64'
  }
});

const embedding = result.outputs['last_hidden_state'].data;
```
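`last_hidden_state` is a per-token tensor; to get a single sentence vector you typically mean-pool it over the non-padding tokens. A minimal sketch, assuming the output data is a flat Float32Array in [1, seqLen, hidden] order and that the hidden size (384 here, common for small Sentence-BERT variants) matches your model:
```ts
// Hypothetical mean pooling over last_hidden_state, masked by attention_mask.
// hiddenSize must match your model; 384 below is only an example value.
function meanPool(hidden: Float32Array, mask: bigint[], hiddenSize: number): Float32Array {
  const pooled = new Float32Array(hiddenSize);
  let count = 0;
  for (let t = 0; t < mask.length; t++) {
    if (mask[t] === 0n) continue; // skip padding tokens
    count++;
    for (let h = 0; h < hiddenSize; h++) {
      pooled[h] += hidden[t * hiddenSize + h];
    }
  }
  for (let h = 0; h < hiddenSize; h++) pooled[h] /= Math.max(count, 1);
  return pooled;
}

const sentenceVector = meanPool(embedding as Float32Array, attentionMask, 384);
```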
Advanced Usage
Performance Monitoring
```ts
// Check device capabilities
const capabilities = await manager.getDeviceCapabilities();
console.log('Neural Engine:', capabilities.hasNeuralEngine);

// Monitor during inference
const metrics = await manager.getSystemMetrics();
if (metrics.thermal.shouldThrottle) {
  console.warn('Device is hot, reducing inference frequency');
}
```
Streaming Text Generation
```ts
await manager.runStreamingInference(
  'tinyllama',
  'Write a story about',
  { maxTokens: 100 },
  (token, isDone) => {
    console.log(token);
    if (isDone) {
      console.log('Generation complete!');
    }
  }
);
```
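In a component you would usually append each token to state instead of logging it. A minimal sketch, assuming the same callback signature as above; the component and state names are purely illustrative:
```tsx
import { useState } from 'react';
import { Text } from 'react-native';
import { MLInferenceManager } from 'react-native-silver-train';

// Illustrative component: streams tokens into local state as they arrive.
function StreamingStory() {
  const [story, setStory] = useState('');

  const generate = async () => {
    setStory('');
    await MLInferenceManager.getInstance().runStreamingInference(
      'tinyllama',
      'Write a story about',
      { maxTokens: 100 },
      (token, isDone) => {
        setStory((prev) => prev + token);
        if (isDone) console.log('Generation complete!');
      }
    );
  };

  return <Text onPress={generate}>{story || 'Tap to generate'}</Text>;
}
```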
Model Caching
```ts
import { ModelCacheManager } from 'react-native-silver-train';

const cacheManager = ModelCacheManager.getInstance();

// Get all cached models
const models = cacheManager.getAllCachedModels();

// Clear cache
await cacheManager.clearCache();

// Get cache size
const size = await cacheManager.getCacheSize();
```
Documentation
- 📖 Generic Inference Guide - Complete guide for any model type
- 🎯 API Reference - Full API documentation
- 💡 Examples - Code examples for common use cases
Requirements
- iOS 13.0 or higher
- React Native 0.70+
- CocoaPods
Performance
- Neural Engine: Automatic acceleration on iOS devices with Neural Engine (A11+)
- Native Tokenizers: 10x faster than JavaScript implementations
- Optimized Inference: Efficient memory management and thermal monitoring
Contributing
See CONTRIBUTING.md for contribution guidelines.
License
MIT
Acknowledgments
- Built with ONNX Runtime
- Powered by Apple's CoreML
- Created with create-react-native-library
Made with ❤️ for the React Native community
