@mauriciobenjamin700/ort-vision-sdk-web
v0.2.1
High-level TypeScript SDK for browser computer vision inference with ONNX Runtime Web.
Mirrors the Python ort-vision-sdk API: task-oriented classes (Classifier, Detector) that handle image loading, preprocessing, execution-provider selection and postprocessing. Output is the same typed shape as the Python version (ClassificationResult, DetectionResult, BoundingBox).
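To make the "postprocessing" step concrete, here is a minimal sketch of what a classifier typically does with raw model logits: a softmax followed by a top-K selection. This is an illustration only, not the SDK's actual implementation; the names `softmax`, `topK` and `ScoredLabel` are hypothetical and were chosen to echo the `className` / `confidence` fields used in the examples.

```typescript
// Illustrative postprocessing only; not the SDK's actual code.
// `ScoredLabel`, `softmax` and `topK` are hypothetical names.
interface ScoredLabel {
  className: string;
  confidence: number;
}

/** Numerically stable softmax: subtract the max logit before exponentiating. */
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

/** Return the k most probable labels, highest confidence first. */
function topK(logits: number[], labels: string[], k: number): ScoredLabel[] {
  if (logits.length !== labels.length) {
    throw new Error("logits and labels must have the same length");
  }
  return softmax(logits)
    .map((confidence, i) => ({ className: labels[i] ?? `class_${i}`, confidence }))
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, k);
}
```

With logits `[1, 3, 2]` and labels `["a", "b", "c"]`, `topK(..., 2)` returns the entries for `"b"` and `"c"`, in that order.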
Installation
npm install @mauriciobenjamin700/ort-vision-sdk-web onnxruntime-web
onnxruntime-web is a peer dependency: you bring your own version and ship the matching .wasm files yourself.
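One way to point ONNX Runtime Web at self-hosted .wasm files is its `env.wasm.wasmPaths` setting. The sketch below wraps that in a small hypothetical helper (`setWasmBaseUrl`, with a structural `OrtEnvLike` type so it compiles without importing the runtime). It assumes the files were copied to a folder your server exposes, that the prefix is joined to file names as-is (hence the trailing slash), and that the SDK uses the same `onnxruntime-web` module instance as your app.

```typescript
// Structural stand-in for the part of `ort.env` touched here (hypothetical type),
// so the sketch compiles without importing onnxruntime-web.
interface OrtEnvLike {
  wasm: { wasmPaths?: unknown };
}

/** Set the URL prefix the runtime should load its .wasm files from. */
function setWasmBaseUrl(env: OrtEnvLike, baseUrl: string): string {
  // Ensure a trailing slash so "prefix + fileName" forms a valid URL.
  const prefix = baseUrl.endsWith("/") ? baseUrl : `${baseUrl}/`;
  env.wasm.wasmPaths = prefix;
  return prefix;
}

// Usage in an app, before creating any Classifier or Detector
// (assumes the .wasm files are served from /ort/):
//   import * as ort from "onnxruntime-web";
//   setWasmBaseUrl(ort.env, "/ort");
```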
Quick start
Image classification
import { Classifier } from "@mauriciobenjamin700/ort-vision-sdk-web";

const clf = await Classifier.create("/models/resnet50.onnx", {
  labels: ["tench", "goldfish", /* ... 1000 ImageNet labels */],
});

const result = await clf.predict("/images/dog.jpg", { topK: 5 });
console.log(result.className, result.confidence);
console.log(result.probabilities);
// result.image is an RGBImage (HWC RGB Uint8Array): the original input.
Object detection
import { Detector } from "@mauriciobenjamin700/ort-vision-sdk-web";

// labels defaults to "coco" (80 classes)
const det = await Detector.create("/models/yolov8n.onnx");

const detections = await det.predict("/images/street.jpg", {
  confThreshold: 0.4,
});

for (const d of detections) {
  console.log(d.className, d.confidence, d.bbox.asXyxy());
  // d.croppedImage is an RGBImage of just that bounding box region.
}
Accepted image inputs
predict(image) and loadImage(image) both accept:
- string: a URL fetched via fetch().
- Blob / File: for <input type="file"> uploads.
- HTMLImageElement: an existing <img> tag.
- HTMLCanvasElement / OffscreenCanvas: an already-rendered canvas.
- ImageBitmap: from createImageBitmap().
- ImageData: raw pixel buffer (RGBA from canvas getImageData()).
- RGBImage: the SDK's canonical HWC RGB Uint8Array wrapper.
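Since ImageData is RGBA while RGBImage is described as HWC RGB, a conversion between the two has to drop the alpha channel. A minimal sketch of that step, assuming tightly packed row-major pixels; `rgbaToRgb` is a hypothetical helper, not an SDK export, and the SDK's own conversion may differ:

```typescript
/**
 * Drop the alpha channel from an RGBA buffer (such as ImageData.data)
 * and return tightly packed HWC RGB bytes.
 * Hypothetical helper for illustration; not part of the SDK.
 */
function rgbaToRgb(
  rgba: Uint8ClampedArray | Uint8Array,
  width: number,
  height: number,
): Uint8Array {
  const pixels = width * height;
  if (rgba.length !== pixels * 4) {
    throw new Error(`expected ${pixels * 4} RGBA bytes, got ${rgba.length}`);
  }
  const rgb = new Uint8Array(pixels * 3);
  for (let p = 0; p < pixels; p++) {
    // Copy R, G, B of pixel p and skip its alpha byte.
    rgb.set(rgba.subarray(p * 4, p * 4 + 3), p * 3);
  }
  return rgb;
}
```

For a 2x1 image with bytes `[1, 2, 3, 255, 4, 5, 6, 128]`, the result is `[1, 2, 3, 4, 5, 6]`.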
Execution providers
The default provider order is ["webgpu", "wasm"] — ONNX Runtime tries WebGPU first and silently falls back to WebAssembly if WebGPU isn't available. You can override per task:
const clf = await Classifier.create(model, {
  labels,
  providers: ["wasm"], // force CPU
});
For WebGPU to actually engage you need a recent ORT-Web build, a Chromium-based browser with WebGPU enabled, and a secure context (https:// or localhost). If you also want SharedArrayBuffer-based wasm threading, the page must additionally be served with the right COOP/COEP headers.
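The fallback happens silently, so an app that wants to know up front which provider is likely to run can check the browser's capabilities itself and pass an explicit `providers` list. The following is a sketch under assumptions: `BrowserCapabilities`, `chooseProviders` and `canUseWasmThreads` are hypothetical names, and the checks only approximate what the runtime does internally.

```typescript
type Provider = "webgpu" | "wasm";

// Hypothetical capability snapshot. In a browser it could be filled from:
//   hasWebGpu:           "gpu" in navigator
//   isSecureContext:     globalThis.isSecureContext
//   crossOriginIsolated: globalThis.crossOriginIsolated
interface BrowserCapabilities {
  hasWebGpu: boolean;
  isSecureContext: boolean;
  crossOriginIsolated: boolean;
}

/** Prefer WebGPU when it is exposed in a secure context; always keep wasm as fallback. */
function chooseProviders(caps: BrowserCapabilities): Provider[] {
  return caps.hasWebGpu && caps.isSecureContext ? ["webgpu", "wasm"] : ["wasm"];
}

/** SharedArrayBuffer (and thus wasm threading) needs a cross-origin isolated page. */
function canUseWasmThreads(caps: BrowserCapabilities): boolean {
  return caps.crossOriginIsolated;
}
```

The returned array can be passed as the `providers` option shown above; logging it is a cheap way to notice when a deployment has quietly dropped to CPU.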
Status
This project is alpha — the public API is stable enough to build against, but minor versions may introduce breaking changes during the pre-1.0 phase. Pin the version range you build against.
- Source code & issues: https://github.com/mauriciobenjamin700/ort-vision-sdk
- Changelog: https://github.com/mauriciobenjamin700/ort-vision-sdk/blob/main/sdk-js-web/CHANGELOG.md
- Python counterpart: ort-vision-sdk
License
MIT — see LICENSE.
