dust-onnx-capacitor
v0.1.11
Capacitor plugin for on-device ONNX Runtime model loading over .onnx files.
Demo
Run the full demo from a clean clone:
git clone https://github.com/rogelioRuiz/dust-onnx-capacitor
cd dust-onnx-capacitor && npm install
npm run test:ios # 22 tests (8 serve + 14 ONNX) — builds & installs app
npm run test:yolo-ios # YOLO inference — requires test:ios run first
Android: replace test:ios/test:yolo-ios with test:android/test:yolo-android.
Add --verbose for full build output (xcodebuild, gradlew, cap sync):
npm run test:ios:verbose
npm run test:android:verbose
capacitor-onnx
Capacitor plugin for on-device ONNX Runtime model loading, image preprocessing, and tensor inference over .onnx files.
Stage O1+O2+O3+O4+O5+O6:
- model lifecycle management (load, unload, list, metadata)
- validated tensor I/O and single inference
- JPEG/PNG image preprocessing to normalized NCHW tensors
- hardware-accelerated execution providers (CoreML on iOS, NNAPI/XNNPACK on Android) with automatic CPU fallback
- DustCore registry integration with ref-counted session lifecycle, priority-based eviction, and OS memory pressure handling
- multi-step pipeline inference with output-to-input chaining
| | Android | iOS | Web |
|---|---|---|---|
| Runtime | ONNX Runtime 1.20.0 | onnxruntime-objc ~1.20 | Stub (throws) |
| Min version | API 26 | iOS 16.0 | — |
| Architecture | arm64-v8a only | arm64 + x86_64 sim | — |
Install
npm install dust-onnx-capacitor dust-core-capacitor
npx cap sync
dust-core-capacitor is a required peer dependency — it provides the shared ML contract types (DustModelServer, DustModelSession, DustCoreError, etc.) that capacitor-onnx implements.
iOS (SPM)
The iOS build resolves onnxruntime-swift-package-manager via Swift Package Manager automatically on first cap sync. No CocoaPods step required.
Android
Add the Kotlin gradle plugin to android/build.gradle:
classpath 'org.jetbrains.kotlin:kotlin-gradle-plugin:2.1.20'
Ensure minSdkVersion is at least 26 in android/variables.gradle.
API
import { ONNX } from 'capacitor-onnx';
loadModel
const result = await ONNX.loadModel({
descriptor: {
id: 'my-model',
format: 'onnx',
url: '/absolute/path/to/model.onnx',
},
config: { // optional
accelerator: 'auto', // 'auto' | 'cpu' | 'nnapi' | 'coreml' | 'xnnpack' | 'metal'
threads: 4, // or { interOp: 2, intraOp: 4 }
graphOptLevel: 'all', // 'disable' | 'basic' | 'extended' | 'all'
memoryPattern: true,
},
priority: 0, // 0 = interactive, 1 = background
});
// result.modelId — string
// result.metadata — { inputs: TensorMetadata[], outputs: TensorMetadata[], accelerator, opset? }
unloadModel
await ONNX.unloadModel({ modelId: 'my-model' });
listLoadedModels
const { modelIds } = await ONNX.listLoadedModels();
// modelIds: string[]
getModelMetadata
const metadata = await ONNX.getModelMetadata({ modelId: 'my-model' });
// metadata.inputs — [{ name, dtype, shape }]
// metadata.outputs — [{ name, dtype, shape }]
runInference
const result = await ONNX.runInference({
modelId: 'my-model',
inputs: [
{ name: 'input_a', dtype: 'float32', shape: [1, 3], data: [1, 2, 3] },
{ name: 'input_b', dtype: 'float32', shape: [1, 3], data: [4, 5, 6] },
],
outputNames: ['output'], // optional — omit to return all outputs
});
// result.outputs — [{ name, dtype, shape, data }]
Input validation runs before inference:
- Shape: rank and static dimensions must match model metadata (-1 dimensions are dynamic and accept any size)
- Dtype: input dtype must match the model's expected dtype
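The validation rules above can be sketched as a pure function. This is illustrative: `validateInput` is not part of the plugin API, and the local `TensorMetadata` merely mirrors the plugin's type.

```typescript
// Illustrative sketch of the pre-inference shape/dtype checks described above.
interface TensorMetadata {
  name: string;
  dtype: string;
  shape: number[]; // -1 marks a dynamic dimension
}

function validateInput(
  meta: TensorMetadata,
  shape: number[],
  dtype: string,
): string | null {
  if (dtype !== meta.dtype) return 'dtypeError';
  if (shape.length !== meta.shape.length) return 'shapeError'; // rank mismatch
  for (let i = 0; i < shape.length; i++) {
    // -1 in metadata is dynamic and accepts any size
    if (meta.shape[i] !== -1 && meta.shape[i] !== shape[i]) return 'shapeError';
  }
  return null; // input is valid
}

const meta: TensorMetadata = { name: 'input', dtype: 'float32', shape: [-1, 3] };
console.log(validateInput(meta, [5, 3], 'float32')); // null (dynamic batch accepted)
console.log(validateInput(meta, [1, 4], 'float32')); // 'shapeError'
console.log(validateInput(meta, [1, 3], 'int64'));   // 'dtypeError'
```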
runPipeline
const { results } = await ONNX.runPipeline({
modelId: 'my-model',
steps: [
{
inputs: [
{ name: 'input', shape: [1, 3], dtype: 'float32', data: [1, 2, 3] },
],
},
{
inputs: [
{ name: 'input', data: 'previous_output' }, // chain from step 0 output named 'input'
],
},
{
inputs: [
{ name: 'input', data: { fromStep: 0, outputName: 'output' } }, // explicit step reference
],
outputNames: ['output'],
},
],
});
// results — [{ outputs: [...] }, { outputs: [...] }, { outputs: [...] }]
runPipeline executes multiple sequential inference steps on the same session within a single bridge call. This eliminates bridge round-trip overhead for multi-step workflows (e.g. PaddleOCR detection → recognition).
Step input types:
- Literal — data: number[] with shape and dtype — raw tensor data, same as runInference
- 'previous_output' — data: 'previous_output' — substitutes the output tensor of the same name from the immediately preceding step
- Step reference — data: { fromStep, outputName } — substitutes a named output from any earlier step
Error behavior: if any step fails, the pipeline halts immediately. The error message includes the failing step index (e.g. "Pipeline step 2 failed: ...").
Memory management: intermediate step results are released as soon as no future step references them.
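The three step-input forms can be sketched as a small resolver. This is a standalone illustration of the chaining rules, not the plugin's internal code; `resolveStepInput` and the local `Tensor` type are invented for the example.

```typescript
// Illustrative resolver for the three runPipeline input forms described above.
interface Tensor { name: string; data: number[] }
type StepData = number[] | 'previous_output' | { fromStep: number; outputName: string };

function resolveStepInput(
  name: string,
  data: StepData,
  priorOutputs: Tensor[][], // outputs of steps 0..n-1, in order
): number[] {
  if (Array.isArray(data)) return data; // literal tensor data
  const pick = (outs: Tensor[], wanted: string): number[] => {
    const t = outs.find((o) => o.name === wanted);
    if (!t) throw new Error(`no output named ${wanted}`);
    return t.data;
  };
  if (data === 'previous_output') {
    // same tensor name, taken from the immediately preceding step
    return pick(priorOutputs[priorOutputs.length - 1], name);
  }
  // explicit { fromStep, outputName } reference to any earlier step
  return pick(priorOutputs[data.fromStep], data.outputName);
}

const step0: Tensor[] = [{ name: 'output', data: [5, 7, 9] }];
console.log(resolveStepInput('output', 'previous_output', [step0])); // [5, 7, 9]
console.log(resolveStepInput('x', { fromStep: 0, outputName: 'output' }, [step0])); // [5, 7, 9]
```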
preprocessImage
const { tensor } = await ONNX.preprocessImage({
data: base64Image, // base64 JPEG/PNG payload, no data: prefix
width: 224,
height: 224,
config: {
resize: 'letterbox', // 'stretch' | 'letterbox' | 'crop_center'
normalization: 'imagenet', // 'imagenet' | 'minus1_plus1' | 'zero_to_1' | 'none'
// mean: [0.5, 0.5, 0.5], // optional custom mean overrides normalization preset
// std: [0.5, 0.5, 0.5], // optional custom std overrides normalization preset
},
});
// tensor — { name: 'image', dtype: 'float32', shape: [1, 3, 224, 224], data: [...] }
preprocessImage decodes JPEG/PNG bytes, resizes to the requested output dimensions, and returns a channel-first tensor ready to pass into runInference.
Resize modes:
- stretch — scale directly to the target size
- letterbox — preserve aspect ratio and pad with RGB(114, 114, 114)
- crop_center — preserve aspect ratio, fill the target frame, and center-crop overflow
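The letterbox geometry is worth seeing as arithmetic. A sketch (the `letterbox` function is illustrative, not plugin API; it only computes the scale and padding, the pad fill color RGB(114, 114, 114) comes from the documented behavior):

```typescript
// Illustrative letterbox math: scale to fit, then pad the remainder symmetrically.
function letterbox(srcW: number, srcH: number, dstW: number, dstH: number) {
  const scale = Math.min(dstW / srcW, dstH / srcH); // preserve aspect ratio
  const newW = Math.round(srcW * scale);
  const newH = Math.round(srcH * scale);
  return {
    scale,
    newW,
    newH,
    padX: Math.floor((dstW - newW) / 2), // left/right padding (pad color RGB 114,114,114)
    padY: Math.floor((dstH - newH) / 2), // top/bottom padding
  };
}

// A 1280x720 photo letterboxed into a 640x640 model input:
console.log(letterbox(1280, 720, 640, 640));
// { scale: 0.5, newW: 640, newH: 360, padX: 0, padY: 140 }
```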
Normalization modes:
- imagenet — (pixel / 255 - mean) / std using ImageNet RGB statistics
- minus1_plus1 — pixel / 127.5 - 1
- zero_to_1 — pixel / 255
- none — raw 0...255 channel values
When config.mean and/or config.std are provided, the plugin applies ((pixel / 255) - mean) / std using those custom values instead of a preset normalization mode.
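The presets reduce to simple per-channel arithmetic. A sketch, assuming the standard ImageNet statistics (mean [0.485, 0.456, 0.406], std [0.229, 0.224, 0.225]); `normalize` is illustrative, not plugin API:

```typescript
// Illustrative per-pixel normalization math for the presets above.
const IMAGENET_MEAN = [0.485, 0.456, 0.406]; // standard ImageNet RGB statistics
const IMAGENET_STD = [0.229, 0.224, 0.225];

function normalize(rgb: number[], mode: string): number[] {
  switch (mode) {
    case 'imagenet':
      return rgb.map((p, c) => (p / 255 - IMAGENET_MEAN[c]) / IMAGENET_STD[c]);
    case 'minus1_plus1':
      return rgb.map((p) => p / 127.5 - 1);
    case 'zero_to_1':
      return rgb.map((p) => p / 255);
    default:
      return rgb; // 'none': raw 0..255 values
  }
}

console.log(normalize([255, 255, 255], 'minus1_plus1')); // [1, 1, 1]
console.log(normalize([0, 0, 0], 'zero_to_1')); // [0, 0, 0]
```

Passing custom `mean`/`std` simply swaps the ImageNet constants for your own values in the same `(pixel / 255 - mean) / std` formula.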
Accelerator selection
The config.accelerator field controls which ONNX Runtime execution provider (EP) is used:
| Value | Android | iOS |
|---|---|---|
| 'auto' | NNAPI | CoreML |
| 'cpu' | CPU | CPU |
| 'nnapi' | NNAPI | CPU (fallback) |
| 'coreml' | CPU (fallback) | CoreML |
| 'xnnpack' | XNNPACK | CPU (fallback) |
| 'metal' | CPU (fallback) | CPU (fallback) |
Fallback behavior: If the requested EP fails to initialize (e.g. NNAPI unavailable on emulator, CoreML unsupported model op), the plugin automatically retries with CPU-only options. The metadata.accelerator field in the result reflects the EP that was actually used.
CoreML model cache (iOS): When CoreML is selected, compiled .mlmodel files are cached in Application Support/onnx-cache/{modelId}/. ORT handles cache invalidation internally based on the model graph hash — subsequent loads of the same model skip recompilation.
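The selection table above can be expressed as a pure lookup with CPU fallback, in the spirit of the AcceleratorSelector described later. The function and table below are illustrative, not the plugin's actual code:

```typescript
// Illustrative sketch of the accelerator -> execution provider table, with CPU fallback.
type Platform = 'android' | 'ios';

function resolveEP(accelerator: string, platform: Platform): string {
  const table: Record<Platform, Record<string, string>> = {
    android: { auto: 'nnapi', cpu: 'cpu', nnapi: 'nnapi', xnnpack: 'xnnpack' },
    ios: { auto: 'coreml', cpu: 'cpu', coreml: 'coreml' },
  };
  // Anything unsupported on the platform falls back to CPU
  return table[platform][accelerator] ?? 'cpu';
}

console.log(resolveEP('auto', 'ios'));      // 'coreml'
console.log(resolveEP('nnapi', 'ios'));     // 'cpu' (fallback)
console.log(resolveEP('metal', 'android')); // 'cpu' (fallback)
```

Note this models only the static table; the runtime fallback (retrying with CPU when an EP fails to initialize) happens natively, and the actually-used EP is reported in `metadata.accelerator`.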
Error codes
| Code | When |
|---|---|
| inferenceFailed | File not found, corrupt model, ORT load/run failure |
| formatUnsupported | descriptor.format is not 'onnx' |
| modelNotFound | unloadModel / getModelMetadata / runInference with unknown ID |
| invalidInput | Missing required fields |
| shapeError | runInference input shape does not match model metadata |
| dtypeError | runInference input dtype does not match model metadata |
| preprocessError | preprocessImage failed to decode or transform the image |
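A caller can branch on these codes when a plugin call rejects. The helper below is a sketch; the message strings are illustrative and `describeError` is not part of the plugin:

```typescript
// Illustrative mapping from the plugin's error codes to user-facing guidance.
function describeError(code: string): string {
  switch (code) {
    case 'modelNotFound':
      return 'Load the model before calling this method.';
    case 'shapeError':
    case 'dtypeError':
      return 'Check getModelMetadata(): input tensors must match names, shapes, and dtypes.';
    case 'formatUnsupported':
      return 'Only .onnx models are supported.';
    case 'preprocessError':
      return 'Image could not be decoded. Is it valid JPEG/PNG base64?';
    case 'invalidInput':
      return 'A required field is missing from the call.';
    default:
      return 'Model load or inference failed.';
  }
}

console.log(describeError('shapeError'));
```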
Types
type TensorDtype = 'float16' | 'float32' | 'float64' | 'int8' | 'int16' | 'int32' | 'int64' | 'uint8' | 'bool' | 'string' | 'unknown';
interface TensorMetadata {
name: string;
dtype: TensorDtype;
shape: number[];
}
interface TensorValue {
name: string;
data: number[];
shape: number[];
dtype?: TensorDtype; // defaults to 'float32'
}
interface InferenceTensorValue {
name: string;
data: number[];
shape: number[];
dtype: TensorDtype; // always present in outputs
}
type ResizeMode = 'stretch' | 'letterbox' | 'crop_center';
type NormalizationMode = 'imagenet' | 'minus1_plus1' | 'zero_to_1' | 'none';
interface PreprocessConfig {
resize?: ResizeMode;
normalization?: NormalizationMode;
mean?: [number, number, number];
std?: [number, number, number];
}
interface PreprocessResult {
tensor: InferenceTensorValue;
}
interface ONNXModelMetadata {
inputs: TensorMetadata[];
outputs: TensorMetadata[];
accelerator: string;
opset?: number;
}
interface TensorReference {
fromStep: number;
outputName: string;
}
interface PipelineStepInput {
name: string;
shape?: number[];
dtype?: TensorDtype;
data: number[] | 'previous_output' | TensorReference;
}
interface PipelineStep {
inputs: PipelineStepInput[];
outputNames?: string[];
}
interface RunPipelineResult {
results: RunInferenceResult[];
}
Architecture
┌──────────────────────────────────────────┐
│ TypeScript API │
│ src/definitions.ts src/plugin.ts │
└─────────────┬────────────────────────────┘
│ Capacitor bridge
┌──────────┴──────────┐
▼ ▼
┌──────────────┐ ┌──────────────┐
│ Android │ │ iOS │
│ ONNXPlugin │ │ ONNXPlugin │
│ .kt │ │ .swift │
├──────────────┤ ├──────────────┤
│ONNXSession │ │ONNXSession │
│ Manager.kt │ │ Manager │
│ │ │ .swift │
├──────────────┤ ├──────────────┤
│ Accelerator │ │ Accelerator │
│ Selector.kt │ │ Selector │
│ (NNAPI/ │ │ .swift │
│ XNNPACK) │ │ (CoreML) │
├──────────────┤ ├──────────────┤
│ OrtSession │ │ORTSession │
│ Engine.kt │ │ Engine │
│ (ONNXEngine)│ │ .swift │
├──────────────┤ ├──────────────┤
│ onnxruntime │ │onnxruntime │
│ -android │ │ -objc │
│ 1.20.0 │ │ ~1.20 │
└──────────────┘ └──────────────┘
│ │
└────────┬────────┘
▼
dust-core-capacitor
(shared ML contracts)
Both platforms use the same patterns:
- Dedicated inference thread/queue — Android: HandlerThread, iOS: DispatchQueue
- Thread-safe session cache — Android: ReentrantLock, iOS: NSLock
- Reference counting — loading the same model ID twice increments the ref count instead of creating a duplicate session
- ONNXEngine seam — ONNXSession delegates inference to an ONNXEngine protocol/interface; production uses OrtSessionEngine (real ORT), unit tests inject a MockONNXEngine
- ImagePreprocessor seam — JPEG/PNG decode, resize, normalization, and NCHW packing live in a pure ImagePreprocessor on each platform with no Capacitor or ORT dependency
- Pre-inference validation — shape rank/dimensions and dtype checked against model metadata before calling ORT
- Pipeline execution — runPipeline executes sequential inference steps within a single bridge call, resolving previous_output and { fromStep, outputName } references between steps, with automatic release of intermediate tensors
- AcceleratorSelector — pure function/struct that maps accelerator config to execution provider options; self-contained try/catch fallback to CPU on EP failure
- DustCore registry — sessions are registered with DustCoreRegistry for cross-plugin discovery; loadModel(descriptor:priority:) flows through the shared DustModelServer protocol
- Ref-counted session lifecycle — unloadModel decrements refCount and keeps the session cached; forceUnloadModel removes it entirely; evictUnderPressure removes zero-ref sessions by priority (.standard = background only, .critical = all)
- OS memory pressure — iOS: UIApplication.didReceiveMemoryWarningNotification triggers .critical eviction; Android: ComponentCallbacks2.onTrimMemory(RUNNING_CRITICAL) and onLowMemory() trigger .critical eviction
Project structure
capacitor-onnx/
├── src/ # TypeScript definitions + web stub
│ ├── definitions.ts # Public API types
│ ├── plugin.ts # Plugin registration
│ └── index.ts # Exports
├── android/
│ ├── src/main/.../onnx/ # Kotlin plugin implementation
│ │ ├── ONNXPlugin.kt # Capacitor bridge methods
│ │ ├── ONNXSessionManager.kt # Session cache + lifecycle
│ │ ├── ONNXSession.kt # Session + validation + TensorData
│ │ ├── ONNXEngine.kt # Engine interface
│ │ ├── OrtSessionEngine.kt # Production ORT wrapper
│ │ ├── ImagePreprocessor.kt # Pure image preprocessing
│ │ ├── AcceleratorSelector.kt # EP selection (NNAPI/XNNPACK/CPU)
│ │ ├── ONNXConfig.kt # Runtime config
│ │ └── ONNXError.kt # Error types
│ └── src/test/.../onnx/ # JUnit unit tests
│ ├── ONNXSessionManagerTest.kt # 9 O1 lifecycle tests
│ ├── ONNXInferenceTest.kt # 9 O2 inference tests
│ ├── ONNXPreprocessTest.kt # 8 O3 preprocessing tests
│ ├── ONNXAcceleratorTest.kt # 9 O4 accelerator tests
│ ├── ONNXRegistryTest.kt # 9 O5 registry/session lifecycle tests
│ └── ONNXPipelineTest.kt # 7 O6 pipeline tests
├── ios/
│ ├── Sources/ONNXPlugin/ # Swift plugin implementation
│ │ ├── ONNXPlugin.swift
│ │ ├── ONNXSessionManager.swift
│ │ ├── ONNXSession.swift # Session + validation + protobuf parser
│ │ ├── ONNXEngine.swift # Engine protocol
│ │ ├── ORTSessionEngine.swift # Production ORT wrapper
│ │ ├── ImagePreprocessor.swift # Pure image preprocessing
│ │ ├── AcceleratorSelector.swift # EP selection (CoreML/CPU) + cache
│ │ ├── ONNXConfig.swift
│ │ └── ONNXError.swift
│ └── Tests/ONNXPluginTests/ # XCTest unit tests + fixtures
│ ├── ONNXSessionManagerTests.swift # 9 O1 lifecycle tests
│ ├── ONNXInferenceTests.swift # 9 O2 inference tests
│ ├── ONNXPreprocessTests.swift # 8 O3 preprocessing tests
│ ├── ONNXAcceleratorTests.swift # 9 O4 accelerator tests
│ ├── ONNXRegistryTests.swift # 9 O5 registry/session lifecycle tests
│ └── ONNXPipelineTests.swift # 7 O6 pipeline tests
├── example/ # E2E test app
│ ├── www/index.html # Test runner UI (22 tests + YOLO demo)
│ ├── test-e2e-android.mjs # Android E2E runner (22 tests)
│ ├── test-e2e-ios.mjs # iOS E2E runner (22 tests)
│ ├── test-e2e-yolo-android.mjs # YOLO detection E2E (Android)
│ ├── test-e2e-yolo-ios.mjs # YOLO detection E2E (iOS)
│ └── capacitor.config.json
├── test/
│ ├── fixtures/tiny-test.onnx # Minimal Add model for E2E
│ └── generate-test-fixture.py # Generates tiny-test.onnx
├── package.json
├── DustCapacitorOnnx.podspec
└── tsconfig.json
Testing
Test fixture
ios/Tests/ONNXPluginTests/Fixtures/tiny-test.onnx — a minimal ONNX model:
- Op: Add(input_a, input_b) -> output
- Shapes: [1, 3] float32 for all tensors
- Opset: 13, IR version 7
Regenerate with:
pip install onnx
python scripts/generate-test-fixture.py
Unit tests (51 per platform)
All unit tests use mock engines or injected factories — no real ONNX Runtime required.
| ID | Test | What it verifies |
|---|---|---|
| O1-T1 | Load valid path | Session creation with factory |
| O1-T2 | Metadata access | Input/output tensor names |
| O1-T3 | Missing file | fileNotFound error |
| O1-T4 | Corrupt file | loadFailed error |
| O1-T5 | Wrong format | formatUnsupported rejection before load |
| O1-T6 | Unload model | Cache cleared, listLoadedModels empty |
| O1-T6b | Unload unknown ID | modelNotFound error |
| O1-T7 | Load same ID twice | Ref count incremented, single session |
| O1-T8 | Load two models | Both IDs appear in list |
| O2-T1 | Float32 inference | Returns typed output tensor |
| O2-T2 | Uint8 inference | Preserves non-float tensor dtype |
| O2-T3 | Shape mismatch rank | Rejects with shapeError |
| O2-T4 | Shape mismatch dim | Rejects with shapeError |
| O2-T5 | Dynamic dimension | Accepts -1 metadata dims |
| O2-T6 | Dtype mismatch | Rejects with dtypeError |
| O2-T7 | Output filtering | Returns requested output subset |
| O2-T8 | Inference after unload | Maps to modelNotFound |
| O2-T9 | Engine failure | Maps to inferenceFailed |
| O3-T1 | Red image + ImageNet | Produces expected normalized RGB planes |
| O3-T2 | Letterbox resize | Preserves aspect ratio and centers content |
| O3-T3 | Upscale resize | Handles smaller source images safely |
| O3-T4 | minus1_plus1 | Maps white pixels to 1.0 |
| O3-T5 | zero_to_1 | Maps black pixels to 0.0 |
| O3-T6 | none | Preserves raw 0...255 channel values |
| O3-T7 | Invalid image data | Rejects with preprocessError |
| O3-T8 | Custom mean/std | Overrides preset normalization |
| O4-T1 | Auto accelerator | Config reaches factory / selects platform EP |
| O4-T2 | CPU accelerator | Metadata reflects cpu |
| O4-T3 | Platform EP explicit | CoreML (iOS) / NNAPI (Android) propagated |
| O4-T4 | Cached session reuse | Second load reuses session, not EP re-init |
| O4-T5 | Resolved accelerator | Metadata uses EP actually selected |
| O4-T6 | EP failure fallback | Falls back to CPU on EP init failure |
| O4-T7 | CPU loads without retry | Single factory call, no fallback path |
| O4-T8 | Both fail → LoadFailed | EP + CPU both fail → loadFailed error |
| O4-T9 | Metadata via lookup | getModelMetadata returns resolved accelerator |
| O5-T1 | Registry registration | Manager registered in DustCoreRegistry, resolvable |
| O5-T2 | Load ready descriptor | Session created via descriptor, refCount=1 |
| O5-T3 | Load notLoaded descriptor | Throws modelNotReady |
| O5-T4 | Load unregistered ID | Throws modelNotFound |
| O5-T5 | Unload keeps cached | refCount=0, session still in cache |
| O5-T6 | Load twice reuses | Same instance, refCount=2 |
| O5-T7 | Standard eviction | Background zero-ref removed, interactive kept |
| O5-T8 | Critical eviction | All zero-ref sessions removed |
| O5-T9 | allModelIds after evict | Only live session IDs returned |
| O6-T1 | Two-step pipeline | Both results returned, shapes correct, callCount == 2 |
| O6-T2 | Previous output chaining | Step 2 input substituted from step 0 output |
| O6-T3 | Explicit fromStep chaining | StepReference routes correct tensor |
| O6-T4 | Step 0 failure | Pipeline halts, error contains "step 0" |
| O6-T5 | Step 1 failure | Pipeline halts, error contains "step 1" |
| O6-T6 | Single-step equivalence | Pipeline result matches direct runInference |
| O6-T7 | Pipeline on evicted session | modelEvicted thrown before any run() call |
# Android (from example/android/)
ANDROID_HOME=/path/to/sdk ./gradlew :capacitor-onnx:test
# iOS (from capacitor-onnx/, on macOS with simulator)
xcodebuild test -scheme DustCapacitorOnnx \
-destination "platform=iOS Simulator,name=iPhone 16e" \
-skipPackagePluginValidation
E2E tests (22 plugin tests)
The E2E tests run 22 scenarios in two phases on a real device/simulator with the actual ONNX Runtime:
- Phase 1 — Serve lifecycle (S.1–S.8): Register a tiny test model via dust-serve, download it from the test script's HTTP fixture server, verify events and status transitions, capture the serve-managed file path
- Phase 2 — ONNX API (O.1–O.14): Load/unload/inference tests using the serve-managed path from Phase 1. Error tests (missing file, corrupt file) use direct paths to test ONNX error handling
Both runners use an HTTP server on port 8099 to collect test results and serve model fixtures. The run-float32 test verifies real inference: input_a=[1,2,3] + input_b=[4,5,6] produces output=[5,7,9].
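Since the fixture is a single Add op, the expected output is just the elementwise sum. A minimal sketch of the check an E2E runner can perform (the comparison against `result.outputs` is illustrative):

```typescript
// Expected output of the tiny-test.onnx Add fixture: elementwise sum of the inputs.
const inputA = [1, 2, 3];
const inputB = [4, 5, 6];
const expected = inputA.map((a, i) => a + inputB[i]);
console.log(expected); // [5, 7, 9]
// A runner can then compare result.outputs[0].data against `expected`.
```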
Android (device or emulator):
npm run test:android
iOS (simulator):
npm run test:ios
The runners handle the full pipeline: cap sync, build, install, fixture provisioning, and result collection.
Test results
| Suite | Count | Status |
|---|---|---|
| Android unit tests | 51 (9 O1 + 9 O2 + 8 O3 + 9 O4 + 9 O5 + 7 O6) | PASS |
| iOS unit tests | 51 (9 O1 + 9 O2 + 8 O3 + 9 O4 + 9 O5 + 7 O6) | PASS |
| Android E2E | 22 (8 serve + 14 ONNX) | PASS |
| iOS E2E | 22 (8 serve + 14 ONNX) | PASS |
| Android YOLO E2E | 5 detections via dust-serve | PASS |
| iOS YOLO E2E | 5 detections via dust-serve | PASS |
YOLO E2E
The YOLO E2E tests run end-to-end object detection: the app registers and downloads yolo26s.onnx (~37 MB) through dust-serve, runs inference on a test image, and reports detections. The model is cached after the first download.
npm run test:yolo-ios # requires test:ios run first (app must be installed)
npm run test:yolo-android # requires test:android run first
Interactive YOLO demo
The example app has a Demo tab where you can load a YOLO model via dust-serve, pick an image from the gallery or camera, and run object detection interactively.
Using a different ONNX model
You can run any .onnx model — the plugin is not tied to YOLO. Here's how to swap it.
1. Pick a model
ONNX models are available from:
- ONNX Model Zoo — pre-trained vision, NLP, and audio models
- HuggingFace ONNX models — exported from PyTorch/TensorFlow
- Export your own with torch.onnx.export() or tf2onnx
For mobile, keep models under 50 MB and prefer opset 13+ for broad ONNX Runtime 1.20 compatibility.
2. Load your model
const result = await ONNX.loadModel({
descriptor: {
id: 'my-model', // any string — used as the session key
format: 'onnx',
url: '/absolute/path/to/model.onnx',
},
config: {
accelerator: 'auto', // CoreML on iOS, NNAPI on Android, auto-fallback to CPU
threads: 4,
graphOptLevel: 'all',
memoryPattern: true,
},
});
// Inspect the model's expected inputs/outputs
console.log(result.metadata.inputs); // [{ name, dtype, shape }, ...]
console.log(result.metadata.outputs); // [{ name, dtype, shape }, ...]
console.log(result.metadata.accelerator); // 'coreml', 'nnapi', or 'cpu'
3. Prepare inputs and run inference
Use the metadata to discover tensor names, shapes, and dtypes, then build inputs accordingly:
// For image models, use preprocessImage to get a ready-to-use NCHW tensor
const { tensor } = await ONNX.preprocessImage({
data: base64ImageNoPrefix, // raw base64, no data:image/... prefix
width: 640,
height: 640,
config: { resize: 'letterbox', normalization: 'zero_to_1' },
});
const result = await ONNX.runInference({
modelId: 'my-model',
inputs: [{ name: 'images', data: tensor.data, shape: [1, 3, 640, 640], dtype: 'float32' }],
});
// For non-image models, pass raw tensor data directly
const result2 = await ONNX.runInference({
modelId: 'my-model',
inputs: [{ name: 'input_ids', data: [101, 2023, 2003, 1037, 3231, 102], shape: [1, 6], dtype: 'int64' }],
});
4. Configuration reference
| Config key | Default | What it does |
|-----------|---------|-------------|
| accelerator | 'auto' | Execution provider. 'auto' picks CoreML (iOS) or NNAPI (Android). Falls back to CPU transparently. |
| threads | (ORT default) | Thread count. Pass a number for both inter/intra-op, or { interOp: 2, intraOp: 4 } for fine control. |
| graphOptLevel | 'all' | Graph optimization level. 'all' applies all optimizations. Use 'disable' for debugging. |
| memoryPattern | true | Pre-allocate memory based on tensor shapes. Disable only if shapes vary wildly between runs. |
Caveats
Input tensor names and shapes must match exactly. Unlike LLM inference where you just provide a prompt, ONNX models require tensors with specific names, shapes, and dtypes. Use getModelMetadata() after loading to discover what the model expects. Mismatches produce shapeError or dtypeError before inference runs.
Dynamic dimensions accept any size, static dimensions don't. If metadata shows shape: [-1, 3, 224, 224], the first dimension is dynamic (batch size) and accepts any value, but the remaining three must be exactly 3, 224, 224.
accelerator: 'auto' may fall back to CPU silently. Not all models support CoreML or NNAPI — unsupported ops cause the EP to fail initialization. The plugin retries with CPU automatically. Check metadata.accelerator in the load result to see which EP was actually used.
CoreML first-load compiles the model (~5–15s). On iOS with CoreML, the first load compiles the ONNX graph to an internal format. This is cached in Application Support/onnx-cache/{modelId}/ and skipped on subsequent loads of the same model.
preprocessImage expects raw base64, not a data URL. Strip the data:image/jpeg;base64, prefix before passing. The tensor output is always [1, 3, H, W] NCHW float32 — verify this matches your model's expected input layout.
Large tensor data crosses the bridge as JSON arrays. Inference inputs and outputs are serialized as number[] over the Capacitor bridge. For very large tensors (e.g., high-resolution images), this can be slow. Use preprocessImage on the native side instead of sending raw pixel data from JavaScript.
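A back-of-the-envelope estimate shows why this matters. The function and the bytes-per-element figure below are rough illustrative assumptions, not measured plugin behavior:

```typescript
// Rough JSON payload estimate for a tensor crossing the bridge as number[].
// Assumes ~9 characters per serialized float (digits, decimal point, comma).
function jsonPayloadEstimate(shape: number[], bytesPerElement = 9): number {
  const elements = shape.reduce((a, b) => a * b, 1);
  return elements * bytesPerElement; // approximate serialized size in bytes
}

// A 640x640 RGB image tensor is ~1.2M floats, on the order of 10 MB of JSON:
console.log(jsonPayloadEstimate([1, 3, 640, 640])); // 11059200
```

This is why preprocessing on the native side (preprocessImage takes compact base64 image bytes in, rather than raw pixel arrays) is the recommended path for image inputs.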
ONNX Runtime version compatibility. The plugin bundles ONNX Runtime 1.20. Models exported with newer opsets may use ops not yet supported. Stick to opset 13–20 for best compatibility.
iOS sandbox: models must be in the app container. iOS apps can only read files inside their own sandbox. Use dust-serve to download models — it stores them in the app's data container automatically. If loading manually, the absolute path must point to a file inside the app's sandbox.
cap sync may regenerate patched files. If you manually set the iOS deployment target to 16.0 or Android minSdk to 26, cap sync can overwrite those changes. Re-apply patches after syncing. The E2E test scripts handle this automatically, but manual runs require awareness.
Development
# Build TypeScript
npm run build
# Lint
npm run lint
# Type check
npm run typecheck
License
Copyright 2026 Rogelio Ruiz Perez. Licensed under the Apache License 2.0.
