@reactvision/react-viro-onnx
v1.0.0
ONNX Runtime inference provider for @reactvision/react-viro ViroObjectDetector
ViroReact ONNX, By ReactVision
On-device YOLOE object-detection inference for ViroReact, powered by ONNX Runtime. This package is the inference engine behind ViroObjectDetector: it runs the model (inference, NMS, class-name decoding) fully on-device. ViroObjectDetector handles the camera and plumbing — without this provider, detection returns empty.
MIT licensed and free forever.
Requires
@reactvision/react-viro ≥ 2.57.0 (the release that ships ViroObjectDetector). Works with both React Native CLI and Expo projects.
How it works
- iOS: a vendored, dynamically-linked onnxruntime.xcframework. The ViroONNX Objective-C++ class registers an inference block into VRTObjectDetectorView automatically via +load when the framework is loaded, so no manual call is needed.
- Android: the onnxruntime-android AAR. ViroONNXModule registers the provider through React Native module init, and creates the ORT session with the NNAPI execution provider (USE_FP16) so inference can run on the device GPU/NPU/DSP, falling back to CPU if NNAPI is unavailable or can't compile the graph.
Both sides: run the model, apply confidence threshold + greedy NMS (IoU 0.45), sort by confidence, decode class indices to names from the model's names metadata, and return up to 50 detections (the view trims further to its maxDetections prop).
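The post-processing steps above (confidence threshold, greedy NMS at IoU 0.45, sort by confidence, cap at 50) can be sketched in TypeScript. This is only an illustration, not the native provider's code: the Detection type, the postProcess name, and the choice of class-agnostic suppression are assumptions made for the example, and the confidence threshold is left as a caller-supplied value because this README does not state a default.

```typescript
// Hypothetical types and names for illustration only; the native provider's
// real data structures are not shown in this README.
interface Detection {
  label: string;
  confidence: number;
  // Axis-aligned box as [x1, y1, x2, y2] in any consistent unit.
  box: [number, number, number, number];
}

// Intersection-over-union of two axis-aligned boxes.
function iou(a: Detection["box"], b: Detection["box"]): number {
  const w = Math.max(0, Math.min(a[2], b[2]) - Math.max(a[0], b[0]));
  const h = Math.max(0, Math.min(a[3], b[3]) - Math.max(a[1], b[1]));
  const inter = w * h;
  const areaA = (a[2] - a[0]) * (a[3] - a[1]);
  const areaB = (b[2] - b[0]) * (b[3] - b[1]);
  const union = areaA + areaB - inter;
  return union > 0 ? inter / union : 0;
}

function postProcess(
  candidates: Detection[],
  confidenceThreshold: number, // assumption: supplied by the caller
  iouThreshold = 0.45,
  limit = 50
): Detection[] {
  // Drop low-confidence candidates, then visit the rest best-first.
  const sorted = candidates
    .filter((d) => d.confidence >= confidenceThreshold)
    .sort((a, b) => b.confidence - a.confidence);

  // Greedy NMS: keep a box only if it does not overlap an already-kept box
  // by more than the IoU threshold (class-agnostic here, as an assumption).
  const kept: Detection[] = [];
  for (const candidate of sorted) {
    if (kept.length >= limit) break;
    const suppressed = kept.some((k) => iou(k.box, candidate.box) > iouThreshold);
    if (!suppressed) kept.push(candidate);
  }
  return kept;
}
```

Because candidates are visited in descending confidence order, the returned list is already sorted, and the view can trim it further to its maxDetections prop.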
Installation
npm install @reactvision/react-viro @reactvision/react-viro-onnx
Add both plugins to your app.json (this one after @reactvision/react-viro):
{
  "expo": {
    "plugins": [
      "@reactvision/react-viro",
      "@reactvision/react-viro-onnx"
    ]
  }
}
The config plugin:
- iOS: inserts pod 'ViroReactONNX' into the app target's Podfile (after the React Native pods, so it doesn't disturb use_react_native!). On first pod install it downloads onnxruntime.xcframework (~60 MB, cached, not committed).
- Android: adds implementation 'com.microsoft.onnxruntime:onnxruntime-android:1.22.0' to the app build.gradle.
Then rebuild the native app (npx expo run:ios / run:android). On iOS, confirm in the logs that no [ViroONNX] … not found error appears — the provider registers silently on success.
Bundle your model too: the detector loads .onnx by name from the native bundle, not Metro. See Bundle a model. A missing model surfaces as model not found at runtime.
Local development (consuming this package from source)
If the app installs this package from a packed tarball (e.g. "@reactvision/react-viro-onnx": "file:../path/react-viro-onnx-1.0.0.tgz"), then node_modules holds a snapshot — editing the source here does not reach the app until you re-pack and reinstall:
# in this package (after editing native/JS or the config plugin):
npm run build # only if you changed TS (dist/ + plugin/build/)
npm pack # regenerates react-viro-onnx-1.0.0.tgz
# in the app:
rm -rf node_modules/@reactvision/react-viro-onnx
npm install <path-to>/react-viro-onnx-1.0.0.tgz
Symptoms of a stale tarball: a config-plugin resolution error during expo prebuild (no app.plugin.js in node_modules), or native changes (e.g. NNAPI) never taking effect and no ViroONNX log lines appearing. To skip re-packing during active dev, point the dep at the folder (file:../path/react-viro-onnx) instead of the tarball.
Bundle a model
Ship an .onnx next to your app and reference it by name via the model prop. See model bundling. The prompt-free yoloe-26n model carries 4,585 classes; its label names are read from the ONNX names metadata at load time.
Exporting a text-prompt model
The stock prompt-free model has poor recall on specific common classes (it rarely emits "cup", "keyboard", etc. confidently). For high-recall detection of your classes, export a text-prompt (RepRTA) model that bakes your class list into the detection head via CLIP text embeddings.
scripts/export_text_model.py does this:
# In a Python env with torch (e.g. a venv):
pip install ultralytics
python scripts/export_text_model.py
# → yoloe-26n-seg.onnx with your CLASSES baked in (downloads weights + mobileclip text encoder on first run)
Edit the CLASSES list in the script to your target classes, re-run, then bundle the resulting .onnx (rename as you like, e.g. yoloe-26n-text.onnx) and point the model prop at it.
The export keeps the same output format as the prompt-free model (output0 [1,300,38], end2end NMS, segment task) and writes your class list into the names metadata, so no native changes are needed — the provider reads the new names automatically.
Because the head is reparametrized to your classes, the model emits only those classes. You usually just run it in prompt-free mode:
<ViroObjectDetector model="yoloe-26n-text" mode="prompt-free" />
mode="text" is not what bakes in your classes; that happens here, at export time. text mode is only a runtime label post-filter; add it (with categories) on top of the exported model when you want to narrow the output to a subset of the baked classes:
<ViroObjectDetector
  model="yoloe-26n-text"
  mode="text"
  categories={["cup", "laptop", "keyboard"]} // a subset of the exported CLASSES
/>
Static, not dynamic: the class set is fixed at export time. Changing classes means re-exporting. Fully dynamic runtime text prompts would require bundling the CLIP text encoder + a model that accepts embedding inputs (not currently implemented).
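To make the "runtime label post-filter" role of text mode concrete, here is a minimal TypeScript sketch of such a filter. It is not the provider's implementation: filterByCategories is a hypothetical helper, and the case-insensitive, whitespace-trimmed matching is an assumption made for the example.

```typescript
// Hypothetical helper illustrating a label post-filter. The matching rule
// (case-insensitive, trimmed) is an assumption, not documented behaviour.
function filterByCategories<T extends { label: string }>(
  detections: T[],
  categories: string[]
): T[] {
  const wanted = new Set(categories.map((c) => c.trim().toLowerCase()));
  // Keep only detections whose decoded class name is in the requested set.
  return detections.filter((d) => wanted.has(d.label.trim().toLowerCase()));
}
```

A filter like this can only remove detections the model already produced. Asking for a category that was not in the exported CLASSES list yields nothing, which is why the class set has to be chosen at export time.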
API
The provider registers itself automatically when the native pod/AAR is linked — there's nothing to call. The only exposed helper is a version probe:
import { ViroONNX } from "@reactvision/react-viro-onnx";
ViroONNX.getVersion(); // ONNX Runtime version linked into the app (iOS: 1.20.0, Android: 1.22.0)
Performance
Inference is the per-frame bottleneck. maxFPS throttles how often it runs; the camera keeps rendering at native FPS regardless.
- Android creates the session with the NNAPI EP (USE_FP16). Whether that actually offloads to GPU/NPU depends on the device's NNAPI drivers and how many YOLOE ops they support; unsupported ops (e.g. the end2end NMS) fall back to CPU. Check the logs:
  - "ONNX Runtime inference provider registered." means the module loaded.
  - "ORT session ready (NNAPI=true|false) …" shows whether the NNAPI EP was applied.
  - "infer run=<ms>ms" is the wall-clock time per inference. This is the number to watch.
- If NNAPI doesn't help on a given device, the next levers (largest first) are INT8 quantization of the model (at export time), then a lower input resolution (640 → 480/320). Both trade a little accuracy for a 2–4× speedup.
- The model is yoloe-26n (nano); larger variants are slower.
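The maxFPS behaviour described at the start of this section (inference is throttled while the camera keeps rendering) can be pictured as a simple time gate. The sketch below is an assumption about the general technique, not the component's actual code; createFrameThrottle is a hypothetical name.

```typescript
// Hypothetical time-based gate: returns a function that says whether
// inference should run for a frame arriving at `nowMs` (milliseconds).
function createFrameThrottle(maxFPS: number): (nowMs: number) => boolean {
  const minIntervalMs = 1000 / maxFPS;
  let lastRunMs = -Infinity; // so the very first frame always runs
  return (nowMs: number): boolean => {
    // Too soon since the last inference: skip this frame.
    if (nowMs - lastRunMs < minIntervalMs) return false;
    lastRunMs = nowMs;
    return true;
  };
}
```

With maxFPS set to 5, a gate like this lets through at most one frame every 200 ms; frames in between are still rendered by the camera but are not sent to the model.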
Platform parity
iOS is the reference implementation. Android has parity for AR-session detection: NMS, class names, text-mode filtering, maxDetections, center-square crop, and an aligned screenBoundingBox in dp (the renderer feeds the detector the full uncropped frame + the viewport crop rectangle, the Android equivalent of iOS's displayTransform). Remaining gaps:
- worldPosition (3D hit-test) is not yet emitted on Android.
- Android AR sees the central ~55–60% of the vertical FOV (center-square crop of a portrait frame) vs iOS cropping a landscape sensor frame.
See the platform table in the component docs.
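The vertical-coverage figure for Android follows from the geometry of a center-square crop. As a rough sketch (the helper names and the 1080×1920 frame size are assumptions chosen for illustration, not values taken from the native code), the crop rectangle and the share of the frame's rows it covers can be computed like this:

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Largest centered square that fits inside a frame of the given size.
function centerSquareCrop(frameWidth: number, frameHeight: number): Rect {
  const side = Math.min(frameWidth, frameHeight);
  return {
    x: (frameWidth - side) / 2,
    y: (frameHeight - side) / 2,
    width: side,
    height: side,
  };
}

// Fraction of the frame's rows that fall inside the crop. This is a pixel
// ratio, which only approximates the fraction of the angular field of view.
function verticalCoverage(frameWidth: number, frameHeight: number): number {
  return centerSquareCrop(frameWidth, frameHeight).height / frameHeight;
}
```

For an assumed 1080×1920 portrait frame the square is 1080 pixels on a side, offset 420 pixels from the top, and covers 1080 / 1920 = 56.25% of the rows, which is in line with the ~55–60% quoted above.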
Documentation
- ViroObjectDetector component reference: https://github.com/ReactVision/viro/blob/main/docs/ViroObjectDetector.md
- ViroReact docs: https://viro-community.readme.io/docs/overview
Community
Discord is the best place to find the team and other developers building with ViroReact:
Find Out More
- Website: https://reactvision.xyz
- ViroReact: https://reactvision.xyz/viro-react
- ReactVision Studio: https://studio.reactvision.xyz
- Blog: https://updates.reactvision.xyz
MIT licensed. © ReactVision, Inc.
