# @cantoo/capacitor-onnx

v2.0.0

Capacitor plugin for native ONNX Runtime inference on Android, iOS, and Web.
## Migration from 1.x to 2.0
2.0.0 removes the plugin-side model cache. The plugin no longer downloads, validates, or stores model files — it is now a thin wrapper around ONNX Runtime sessions. The host app owns model storage and provides bytes (web) or a filesystem path (native).
### Contract changes

- `LoadModelInput` no longer accepts `url`, `sha256`, `forceRedownload`, or `timeoutMs`. Pass either `filePath` (iOS/Android) or `modelBuffer: Uint8Array` (web).
- `LoadModelResult` no longer includes `status` (`cache_hit`/`downloaded`).
- Methods `clearModel` and `clearAllCache` have been removed. `release(modelId, version)` still releases the in-memory ORT session.
- `CapacitorOnnxWeb.setWebConfig` no longer accepts `cacheStorage`; only `wasmPath` remains.
- Error codes `NETWORK_ERROR`, `INTEGRITY_ERROR`, and `MODEL_INTEGRITY_ERROR` are no longer reachable.
### Migration example

Before:

```ts
await CapacitorOnnx.loadModel({
  modelId: 'demo-model',
  version: '1.0.0',
  url: 'https://example.com/model.onnx',
  sha256: 'abc...',
});
```

After (native, iOS/Android):

```ts
// Download/cache the model in your app code, e.g. via @capacitor/filesystem.
// Then pass the absolute or file:// path to the plugin.
await CapacitorOnnx.loadModel({
  modelId: 'demo-model',
  version: '1.0.0',
  filePath: '/data/user/0/com.app/files/models/demo-model-1.0.0.onnx',
});
```

After (web):

```ts
const response = await fetch('https://example.com/model.onnx');
const modelBuffer = new Uint8Array(await response.arrayBuffer());
await CapacitorOnnx.loadModel({
  modelId: 'demo-model',
  version: '1.0.0',
  modelBuffer,
});
```

Passing `modelBuffer` on iOS/Android or `filePath` on web rejects with `MODEL_INVALID`. The Capacitor bridge serializes `Uint8Array` inefficiently (base64 / number array), so native callers must always use filesystem paths.
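Since the host app now owns model storage, the native flow usually reduces to "download once, then hand the plugin a path". The sketch below shows that orchestration with injected `exists`/`download` callbacks (e.g. thin wrappers around `@capacitor/filesystem`); the helper name, its parameters, and the `/models` base directory are illustrative assumptions, not part of the plugin API.

```typescript
// Hypothetical helper: resolve a local file path for a model, downloading
// only on first use. `exists` and `download` are injected so the caller can
// back them with @capacitor/filesystem (or anything else).
type Downloader = (url: string, destPath: string) => Promise<void>;

async function ensureModelFile(
  modelId: string,
  version: string,
  url: string,
  exists: (path: string) => Promise<boolean>,
  download: Downloader,
  baseDir = '/models',
): Promise<string> {
  const destPath = `${baseDir}/${modelId}-${version}.onnx`;
  if (!(await exists(destPath))) {
    // First use: fetch the model to local storage.
    await download(url, destPath);
  }
  // Subsequent calls return the cached path without re-downloading.
  return destPath;
}
```

The returned path is what you pass as `filePath` to `loadModel` on iOS/Android.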
## Install

```shell
pnpm add @cantoo/capacitor-onnx
pnpm cap sync android
pnpm cap sync ios
```

### Android setup

`pnpm cap sync android` registers the plugin automatically; no manual MainActivity edits are required. The host app must satisfy:

- `minSdk` ≥ 24 (Android 7.0).
- `compileSdk` ≥ 34.
- JDK ≥ 17 on the build machine. The plugin targets Java 17 bytecode (`sourceCompatibility` / `targetCompatibility` / `kotlinOptions.jvmTarget = '17'`), so any newer JDK (e.g. 21) also works; 17 is just the floor.

The `com.microsoft.onnxruntime:onnxruntime-android` dependency is bundled by the plugin's build.gradle; you do not need to add it yourself. Tune execution providers and threading through `sessionOptions` (see docs/android-optimization.md).
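As a sketch of the `sessionOptions` tuning mentioned above: the `executionProvider` values match the native aliases documented later in this README, but `intraOpNumThreads` and the helper itself are illustrative assumptions; check `SessionOptionsInput` in src/definitions.ts for the real option names.

```typescript
// Illustrative sketch only: field names besides executionProvider are
// assumptions modeled on common ONNX Runtime session options.
interface SessionOptionsSketch {
  executionProvider: 'auto' | 'cpu' | 'nnapi';
  intraOpNumThreads?: number;
}

function androidSessionOptions(cpuCores: number, preferNnapi: boolean): SessionOptionsSketch {
  return {
    executionProvider: preferNnapi ? 'nnapi' : 'cpu',
    // Leave one core free for the UI thread.
    intraOpNumThreads: Math.max(1, cpuCores - 1),
  };
}
```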
## iOS setup

iOS supports both CocoaPods (the default for Capacitor apps) and Swift Package Manager.

**CocoaPods (recommended for Capacitor apps).** `pnpm cap sync ios` registers the plugin automatically: the generated Podfile picks up `CantooCapacitorOnnx.podspec` from `node_modules/@cantoo/capacitor-onnx`, and `pod install` resolves `onnxruntime-objc` transitively. No manual Xcode steps are required.

**Swift Package Manager (alternative).** If the host app prefers SPM, skip the Podfile entry and add the plugin as a local package in Xcode (Package Dependencies → +, pointing to `node_modules/@cantoo/capacitor-onnx`). Xcode resolves `onnxruntime-swift-package-manager` transitively. Add the `CapacitorOnnx` product to the App target.

Requirements either way:

- Minimum deployment target: iOS 14.
- The native bridge is registered automatically via `CapacitorOnnxPlugin.m`; no additional Swift code is required.
## Web setup

onnxruntime-web requires the page to be served in a cross-origin isolated context; without it, the multi-threaded WASM backend falls back (or fails) and `SharedArrayBuffer` is unavailable. The host page must be served with the following response headers:

```
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
```

In addition, any cross-origin asset the page loads (model files, WASM artifacts, fonts, images) needs `Cross-Origin-Resource-Policy: cross-origin` (or `same-site`) on its response, otherwise it will be blocked under COEP. The CDN/storage hosting your .onnx artifacts must also send permissive CORS headers (`Access-Control-Allow-Origin`).

For web-only hosts (without Capacitor), import from the dedicated web entrypoint and configure the WASM path before `loadModel`:

```ts
import { CapacitorOnnxWeb } from '@cantoo/capacitor-onnx/web';

CapacitorOnnxWeb.setWebConfig({
  wasmPath: '/ort-wasm/',
});
```

Symptoms of missing isolation/CORS: `SharedArrayBuffer is not defined`, `NetworkError` when fetching .wasm, or models silently downgrading to single-threaded execution.
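At runtime in the browser you can simply inspect `self.crossOriginIsolated`; the standalone checker below, which verifies the header pair described above on a plain header map, is illustrative and not part of the plugin.

```typescript
// Returns true when the COOP/COEP pair required for cross-origin isolation
// is present. Header lookup is case-insensitive, values are normalized.
function isCrossOriginIsolatedHeaders(headers: Record<string, string>): boolean {
  const get = (name: string) =>
    (headers[name] ?? headers[name.toLowerCase()] ?? '').trim().toLowerCase();
  return (
    get('Cross-Origin-Opener-Policy') === 'same-origin' &&
    get('Cross-Origin-Embedder-Policy') === 'require-corp'
  );
}
```

Useful in an integration test against your hosting setup: fetch the page, pass the response headers in, and fail the test if isolation would be lost.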
## API

The package exports:

- `CapacitorOnnx`
- `CapacitorOnnxWeb` (from `@cantoo/capacitor-onnx/web` for non-Capacitor hosts)
- TypeScript interfaces from `definitions`

### Methods
| Method | Signature | Purpose | Notes |
| --- | --- | --- | --- |
| loadModel | (input: LoadModelInput) => Promise<LoadModelResult> | Creates an ONNX Runtime session from the model bytes (web) or file path (native), and optionally warms it up. Must be called once per modelId+version before run. | Native: pass filePath (absolute path or file:// URI). Web: pass modelBuffer: Uint8Array. Pass warmupInput (a RawTensor matching one valid input shape) to pay first-inference cost upfront, and sessionOptions to pick the execution provider / thread counts. The result includes executionProviderUsed. |
| run | (input: RunInput) => Promise<RunResult> | Runs inference on a previously loaded session. Resolves I/O names from session metadata, so the consumer only supplies inputTensor. | Calls to the same modelId+version are serialized by a per-session lock; different models run in parallel. Returns { logits, latencyMs }. Pre/post-processing is the consumer's responsibility. |
| release | (input: ReleaseModelInput) => Promise<void> | Releases the in-memory ONNX session for the given modelId+version. | Use to free RAM/GPU memory when you are done with a model. The host app is responsible for managing model files on disk. |
Type definitions for every input/result (e.g. LoadModelInput, RawTensor, SessionOptionsInput, PluginError) live in src/definitions.ts.
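As a rough sketch of the `RawTensor` shape used throughout this README (the authoritative definition lives in src/definitions.ts, and may include more element types), plus a consistency check you can run before calling `run`; the helper is illustrative, not part of the plugin:

```typescript
// Assumed shape of RawTensor, inferred from the examples in this README.
interface RawTensorSketch {
  type: 'float32';
  dims: number[];
  data: number[];
}

// Throws when data.length does not match the product of dims, which would
// otherwise surface as an opaque error from the ONNX Runtime session.
function assertTensorShape(t: RawTensorSketch): void {
  const expected = t.dims.reduce((a, b) => a * b, 1);
  if (t.data.length !== expected) {
    throw new Error(`data length ${t.data.length} does not match dims product ${expected}`);
  }
}
```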
## Example

```ts
import { Capacitor } from '@capacitor/core';
import { CapacitorOnnx } from '@cantoo/capacitor-onnx';

async function loadDemoModel() {
  if (Capacitor.getPlatform() === 'web') {
    const response = await fetch('https://example.com/model.onnx');
    const modelBuffer = new Uint8Array(await response.arrayBuffer());
    await CapacitorOnnx.loadModel({
      modelId: 'demo-model',
      version: '1.0.0',
      modelBuffer,
    });
    return;
  }

  // On iOS/Android, the host app is responsible for downloading
  // the model to the filesystem (e.g. via @capacitor/filesystem).
  await CapacitorOnnx.loadModel({
    modelId: 'demo-model',
    version: '1.0.0',
    filePath: '/absolute/path/to/model.onnx',
  });
}

await loadDemoModel();

const result = await CapacitorOnnx.run({
  modelId: 'demo-model',
  version: '1.0.0',
  inputTensor: {
    type: 'float32',
    dims: [1, 4],
    data: [0.1, 0.2, 0.3, 0.4],
  },
});
console.log(result.logits.dims, result.logits.data.length);

await CapacitorOnnx.release({ modelId: 'demo-model', version: '1.0.0' });
```

## Runtime Notes
- `loadModel` supports an optional `warmupInput: RawTensor` to pre-run the session with a sample tensor of the exact shape the model expects (e.g. `{ type: 'float32', dims: [1, 16000], data: [...] }`). Warmup is skipped when `warmupInput` is omitted.
- `loadModel` returns `executionProviderUsed` with the provider that was actually initialized.
- Web provider selection supports `sessionOptions.executionProvider` with `auto`, `wasm`, `webgpu`, `webnn`, plus native aliases (`cpu`/`nnapi`/`coreml` map to `wasm` on web).
- In web `auto` mode, provider resolution tries accelerated providers first (`webgpu`, `webnn`) and falls back to `wasm`.
- iOS provider mapping: `cpu` → CPU, `nnapi`/`coreml` → CoreML, `auto` → CoreML with CPU fallback, web providers (`wasm`/`webgpu`/`webnn`) → CPU.
- `run` accepts `inputTensor` and resolves model I/O names from session metadata (`inputNames`/`outputNames`) instead of hardcoded names.
- Output shape: `RunResult.logits.dims` is the shape ORT materialized for the output tensor: web reads `outputTensor.dims`, Android reads `OnnxTensor.info.shape`, iOS reads `tensorTypeAndShapeInfo().shape`. No heuristics, no symbolic dims (-1) in the result, no batch assumptions. Models with multiple independent dynamic axes are returned with their true runtime shape.
- Errors are normalized with structured fields (`code`, `message`, `retryable`, `correlationId`, `details`).
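Given the structured error fields listed above, a caller can narrow an unknown rejection before deciding whether to retry. The sketch below is a minimal, hedged example; the interface mirrors the fields named in this README, but consult `PluginError` in src/definitions.ts for the authoritative shape.

```typescript
// Assumed shape, based on the fields this README lists for normalized errors.
interface PluginErrorShape {
  code: string;
  message: string;
  retryable: boolean;
  correlationId?: string;
  details?: unknown;
}

// Narrows an unknown rejection to the structured error shape, or returns
// null when the value does not carry the expected fields.
function asPluginError(err: unknown): PluginErrorShape | null {
  if (typeof err === 'object' && err !== null && 'code' in err && 'message' in err) {
    const e = err as Record<string, unknown>;
    return {
      code: String(e.code),
      message: String(e.message),
      retryable: Boolean(e.retryable),
      correlationId: typeof e.correlationId === 'string' ? e.correlationId : undefined,
      details: e.details,
    };
  }
  return null;
}
```

Typical use: catch a rejection from `loadModel`/`run`, call `asPluginError`, and only retry when `retryable` is true (e.g. never for `MODEL_INVALID`).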
## Docs

- Testing scripts and validation flow: docs/testing-scripts.md

## License

MIT
