@ubox-lib/ubox-human
v1.1.2
Webcam pose detection add-on for Ubox — translates Human.js output to the Ubox 14-joint format
Ubox Human
An add-on for ubox-machine that uses the webcam instead of a Kinect sensor. It runs Human.js in the background, detects user skeletons from the camera feed, and feeds them into PhygitalMove using the same data format — making it a drop-in replacement for Kinect input.
CDN
Load after ubox-machine, then call UboxHuman.init() after Ubox.setSensor():
<script src="https://unpkg.com/@ubox-lib/ubox-machine/ubox-machine.min.js"></script>
<script>
  const machine = Ubox.setMachine({
    /* ... */
  });
  const sensor = Ubox.setSensor({ machine: machine });
  machine.init();
</script>
<script src="https://unpkg.com/@ubox-lib/ubox-human/ubox-human.min.js"></script>
<script>
  UboxHuman.init();
</script>
How It Works
Call UboxHuman.init(config) to start the library. It:
- Fetches and loads Human.js from CDN.
- Requests camera access from the browser.
- Starts a continuous detection loop on the webcam feed.
- Calls window.getSkeletons (set by Ubox.setSensor()) with skeleton data on each frame.
Load order matters.
Ubox.setSensor() must be called before UboxHuman.init() because ubox-human feeds data into window.getSkeletons, which setSensor assigns. If the sensor is not set up first, skeleton data will be silently dropped.
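Because a wrong load order fails silently, a small guard can turn the mistake into a visible error. The following is a minimal sketch, not part of ubox-human: `initWhenSensorReady` is a hypothetical helper, and it assumes that `getSkeletons` is a function on the global object once `Ubox.setSensor()` has run.

```javascript
// Hypothetical helper (not part of ubox-human): refuse to start webcam
// tracking until Ubox.setSensor() has installed the getSkeletons hook.
// `scope` is the global object (window in a browser); passing it in
// keeps the helper easy to exercise outside a page.
function initWhenSensorReady(scope, config) {
  if (typeof scope.getSkeletons !== "function") {
    throw new Error(
      "Ubox.setSensor() must be called before UboxHuman.init()"
    );
  }
  return scope.UboxHuman.init(config);
}

// In a page, after Ubox.setSensor(...):
//   initWhenSensorReady(window, { body: { maxDetected: 2 } });
```

Throwing early is a design choice for development builds; a production page might prefer to log a warning and retry instead.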
Configuration
UboxHuman.init(config) accepts an optional config object that is deep-merged with the defaults. You only need to specify the keys you want to override.
UboxHuman.init({
  body: { maxDetected: 2 }, // detect up to 2 people
  filter: { flip: false }, // disable mirror mode
  backend: "webgl", // use WebGL instead of WebGPU
});
Default config:
{
  backend: "webgpu",
  modelBasePath: "https://vladmandic.github.io/human-models/models",
  debug: false,
  object: { enabled: false },
  body: {
    enabled: true,
    maxDetected: 1,
    modelPath: "https://vladmandic.github.io/human-models/models/movenet-thunder.json",
  },
  face: { enabled: false },
  hand: { enabled: false },
  gesture: { enabled: false },
  filter: { enabled: false, flip: true },
  segmentation: { enabled: false },
}
Any key from the Human.js config reference can be overridden.
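To make the deep-merge behavior concrete, here is a rough sketch of how a partial override could combine with the defaults. The `deepMerge` function is illustrative only; the merge routine ubox-human actually uses may differ in details such as array handling.

```javascript
// Illustrative deep merge; not the library's own implementation.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function deepMerge(defaults, overrides) {
  const result = { ...defaults };
  for (const [key, value] of Object.entries(overrides)) {
    // Recurse only when both sides are plain objects; otherwise the
    // override replaces the default outright.
    result[key] =
      isPlainObject(value) && isPlainObject(defaults[key])
        ? deepMerge(defaults[key], value)
        : value;
  }
  return result;
}

// A subset of the documented defaults, for brevity.
const defaults = {
  backend: "webgpu",
  body: { enabled: true, maxDetected: 1 },
  filter: { enabled: false, flip: true },
};

const merged = deepMerge(defaults, {
  body: { maxDetected: 2 },
  filter: { flip: false },
  backend: "webgl",
});
// merged.backend → "webgl"
// merged.body    → { enabled: true, maxDetected: 2 }
// merged.filter  → { enabled: false, flip: false }
```

Sibling keys that are not overridden, such as body.enabled, keep their default values, which is why only the changed keys need to be passed.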
Multi-Person Detection
Set body.maxDetected above 1 to detect multiple people simultaneously:
UboxHuman.init({ body: { maxDetected: 4 } });
When maxDetected > 1, ubox-human uses a nearest-neighbor approximation to maintain stable skeleton IDs across frames:
- Each person is tracked by their hip centroid position from the previous frame.
- A new detection is matched to the nearest known person; unmatched detections receive a new ID.
- A person's ID is preserved for up to 30 frames of occlusion before being expired.
- When a person re-enters after expiry, they receive a new ID.
When maxDetected === 1 (the default), IDs are assigned by index with no tracking overhead.
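The tracking rules above can be sketched as a small nearest-neighbor matcher. This is an illustration under stated assumptions, not the library's code: `createTracker`, the closest-pair-first matching order, and the `{ x, y }` centroid shape are all hypothetical, and the starting ID of 9990 simply mirrors the 999x range listed under Differences from Kinect.

```javascript
// Illustrative tracker; names and matching order are assumptions.
const MAX_MISSED_FRAMES = 30; // occlusion budget from the rules above

function createTracker(firstId = 9990) {
  let nextId = firstId;
  const tracks = new Map(); // id -> { x, y, missed }

  // centroids: hip centroids ({ x, y }) detected in the current frame.
  // Returns the ID assigned to each centroid, in the same order.
  return function update(centroids) {
    // Build every (detection, known person) pair, closest first.
    const pairs = [];
    centroids.forEach((c, index) => {
      for (const [id, t] of tracks) {
        pairs.push({ index, id, dist: Math.hypot(c.x - t.x, c.y - t.y) });
      }
    });
    pairs.sort((a, b) => a.dist - b.dist);

    const ids = new Array(centroids.length).fill(null);
    const claimed = new Set();
    for (const { index, id } of pairs) {
      if (ids[index] === null && !claimed.has(id)) {
        ids[index] = id;
        claimed.add(id);
      }
    }

    // Persons not seen this frame age, and expire past the budget.
    for (const [id, t] of tracks) {
      if (claimed.has(id)) continue;
      t.missed += 1;
      if (t.missed > MAX_MISSED_FRAMES) tracks.delete(id);
    }

    // Unmatched detections receive fresh IDs; store current positions.
    centroids.forEach((c, index) => {
      if (ids[index] === null) ids[index] = nextId++;
      tracks.set(ids[index], { x: c.x, y: c.y, missed: 0 });
    });
    return ids;
  };
}

const update = createTracker();
update([{ x: 0.2, y: 0.5 }]);                      // → [9990]
update([{ x: 0.8, y: 0.5 }, { x: 0.21, y: 0.5 }]); // → [9991, 9990]
```

This sketch has no distance cutoff, so a newcomer who appears while a known person is occluded would inherit that person's ID; a real tracker would likely add a maximum matching distance.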
Camera Controls
| Key | Action |
| --- | ------------------------------------------------ |
| f | Flip the camera feed horizontally (mirror mode). |
Output Format
The skeleton data produced by ubox-human is identical in shape to what a Kinect sensor sends through PhygitalMove. The same userCallback, skeleton, and user objects apply — refer to the PhygitalMove skeleton documentation for the full data structure.
Joint index reference (same 14 joints as PhygitalMove):
| Index | Reference      | Name            |
| ----- | -------------- | --------------- |
| 0     | HAND_RIGHT     | Right hand      |
| 1     | ELBOW_RIGHT    | Right elbow     |
| 2     | SHOULDER_RIGHT | Right shoulder  |
| 3     | HAND_LEFT      | Left hand       |
| 4     | ELBOW_LEFT     | Left elbow      |
| 5     | SHOULDER_LEFT  | Left shoulder   |
| 6     | HEAD           | Head            |
| 7     | SPINE_NECK     | Neck            |
| 8     | SPINE_CENTER   | Center of spine |
| 9     | HIP_CENTER     | Center of hip   |
| 10    | KNEE_LEFT      | Left knee       |
| 11    | KNEE_RIGHT     | Right knee      |
| 12    | ANKLE_LEFT     | Left ankle      |
| 13    | ANKLE_RIGHT    | Right ankle     |
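For application code, the index table can be mirrored as a constant so joints are looked up by name rather than by magic number. The indices come straight from the table; the `JOINT` constant and the `getJoint` helper are hypothetical, and the helper assumes the skeleton exposes its joints as an array in this index order (check the PhygitalMove skeleton documentation for the actual structure).

```javascript
// Joint indices as listed in the table above.
const JOINT = Object.freeze({
  HAND_RIGHT: 0,
  ELBOW_RIGHT: 1,
  SHOULDER_RIGHT: 2,
  HAND_LEFT: 3,
  ELBOW_LEFT: 4,
  SHOULDER_LEFT: 5,
  HEAD: 6,
  SPINE_NECK: 7,
  SPINE_CENTER: 8,
  HIP_CENTER: 9,
  KNEE_LEFT: 10,
  KNEE_RIGHT: 11,
  ANKLE_LEFT: 12,
  ANKLE_RIGHT: 13,
});

// Hypothetical helper: assumes `joints` is an array ordered by index.
function getJoint(joints, name) {
  if (!(name in JOINT)) {
    throw new Error(`Unknown joint: ${name}`);
  }
  return joints[JOINT[name]];
}
```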
Joints HEAD, SPINE_NECK, SPINE_CENTER, and HIP_CENTER are synthesized by averaging nearby detected keypoints; they are not directly detected by the camera model.
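As a rough illustration of what averaging nearby keypoints can look like, the sketch below derives the four synthesized joints from a set of 2D keypoints. Which keypoints ubox-human actually averages is not documented here, so the choices below (nose and ears for the head, shoulders for the neck, hips for the hip center, and the neck/hip midpoint for the spine center) are assumptions, as are the keypoint names and the `{ x, y }` shape.

```javascript
// Mean position of a list of { x, y } points.
function average(points) {
  const sum = points.reduce(
    (acc, p) => ({ x: acc.x + p.x, y: acc.y + p.y }),
    { x: 0, y: 0 }
  );
  return { x: sum.x / points.length, y: sum.y / points.length };
}

// Assumed keypoint selection; the library may combine different points.
function synthesizeJoints(kp) {
  const HEAD = average([kp.nose, kp.leftEar, kp.rightEar]);
  const SPINE_NECK = average([kp.leftShoulder, kp.rightShoulder]);
  const HIP_CENTER = average([kp.leftHip, kp.rightHip]);
  const SPINE_CENTER = average([SPINE_NECK, HIP_CENTER]);
  return { HEAD, SPINE_NECK, SPINE_CENTER, HIP_CENTER };
}
```

Because these joints are derived rather than detected, their accuracy depends on the underlying keypoints; an occluded shoulder or hip shifts the synthesized joint as well.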
Differences from Kinect
| | Kinect (PhygitalMove) | Webcam (ubox-human) |
| ------------ | ------------------------------ | ------------------------------------------------------------- |
| Skeleton IDs | Sensor-assigned integers | 999x range (9990, 9991, …); stable across frames when maxDetected > 1 |
| z (depth) | Accurate distance in meters (0–4 m) | Not available; always 2 |
| Joint origin | Direct sensor measurement | MoveNet Thunder model via Human.js |
| Backend | Native Kinect SDK | WebGPU (configurable) |
