percept-edge
v0.6.0
On-device reactive edge for the Percept SDK: VAD + motion gate + the WatchSpec/Tier0Signal wire-contract.
⚠️ Research preview (v0.6.0). Published for real-life testing + feedback, not production. APIs may change between
0.x releases, so pin a version.
The on-device reactive edge for the Percept SDK — the cheap, high-fps reflex that runs next to the camera. It does two jobs:
- gates — decides when a frame is worth an expensive cloud vision call (motion + VAD salience), so the cloud VLM runs only on the frames that matter (~83% fewer calls in practice);
- runs the brain-selected edge skills — cheap detectors that propose timed events the cognition core counts (reps, sound onsets) without a cloud call.
Pure ESM JavaScript, zero runtime dependencies (vitest is the only devDep). The DSP skills + the gate need no model and no native deps; the one model-backed skill (pose) takes a pluggable adapter you supply.
The wire contract (FROZEN — byte-for-byte with the Python SDK)
    down (cloud → edge):
      WatchSpec { conditions: WatchCondition[], version }
      WatchCondition { goal_id, anchor_text, threshold, region_hint: Box|null, detectors: string[] }
    up (edge → cloud):
      EdgeCapabilities { capabilities: string[] } # advertise, at connect
      Tier0Signal { motion, structural, semantic, ts } # per-frame salience GAUGE
      BoundaryReport { boundaries: [{ goal_id, detector, kind, t, value }] } # per-event, when a skill fires

detectors carries the brain's skill selection; EdgeCapabilities advertises what this edge can run so
the brain selects only within it (the capability handshake). Every shape has make*/parse* validators
in src/contract.js, kept identical to the Python percept-harness dataclasses and verified by a
cross-language parity harness (12/12 byte-for-byte; the JS and Python detectors + gate produce identical
output on identical input).
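To make the validator idea concrete, here is a minimal sketch of what a parse-style check for the downlink shapes could look like. It is not the code in src/contract.js: the function names (`parseWatchConditionSketch`, `parseWatchSpecSketch`) are hypothetical, and the field types (string `goal_id`, numeric `threshold`, and so on) are assumptions inferred from the field names above. `version` is passed through unchecked because its type is not stated here.

```javascript
// Hypothetical sketch, NOT the shipped validators in src/contract.js.
// Field types below are assumptions inferred from the contract's field names.
function parseWatchConditionSketch(raw) {
  if (raw === null || typeof raw !== "object") {
    throw new TypeError("WatchCondition must be an object");
  }
  const { goal_id, anchor_text, threshold, region_hint, detectors } = raw;
  if (typeof goal_id !== "string") throw new TypeError("goal_id must be a string");
  if (typeof anchor_text !== "string") throw new TypeError("anchor_text must be a string");
  if (typeof threshold !== "number" || !Number.isFinite(threshold)) {
    throw new TypeError("threshold must be a finite number");
  }
  // The Box shape is not spelled out above, so only null-or-object is checked.
  if (region_hint !== null && typeof region_hint !== "object") {
    throw new TypeError("region_hint must be a Box or null");
  }
  if (!Array.isArray(detectors) || !detectors.every((d) => typeof d === "string")) {
    throw new TypeError("detectors must be an array of strings");
  }
  return { goal_id, anchor_text, threshold, region_hint, detectors: [...detectors] };
}

function parseWatchSpecSketch(json) {
  const raw = typeof json === "string" ? JSON.parse(json) : json;
  if (raw === null || typeof raw !== "object" || !Array.isArray(raw.conditions)) {
    throw new TypeError("WatchSpec.conditions must be an array");
  }
  return { conditions: raw.conditions.map(parseWatchConditionSketch), version: raw.version };
}

// Example payload; the goal and anchor text are made up for illustration.
const spec = parseWatchSpecSketch(JSON.stringify({
  conditions: [{
    goal_id: "squats",
    anchor_text: "a person doing squats",
    threshold: 0.1,
    region_hint: null,
    detectors: ["motion-periodicity"],
  }],
  version: 1,
}));
console.log(spec.conditions[0].detectors);
```

Rejecting malformed input at the boundary is what lets both sides treat a parsed shape as trusted afterwards.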
The gate (the reflex)
    import { Edge, makeFrame } from "percept-edge";

    const edge = new Edge({ onSignal: (s) => console.log("salience", s) });
    edge.applyWatchSpec(/* the brain's WatchSpec JSON */);
    edge.processFrame(makeFrame(8, 8, 100)); // first frame: no history → no signal
    edge.processFrame(makeFrame(8, 8, 100)); // still scene → gated out (null), no cloud call
    edge.processFrame(makeFrame(8, 8, 240)); // motion clears threshold → Tier0Signal up

motionScore(prev, frame) (src/motion.js) is a deterministic mean-absolute pixel-diff over a minimal
grayscale frame ({ width, height, data: Uint8Array }), normalized to [0,1]. VAD lives in src/vad.js
(EnergyVad + a SileroVadAdapter slot).
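For intuition, a mean-absolute pixel diff and a threshold gate on top of it can be sketched in a few lines. This is an illustrative reimplementation, not the code in src/motion.js: `makeGrayFrame`, `motionScoreSketch`, and `makeGateSketch` are hypothetical names, dividing by 255 is one plausible way to normalize to [0,1], and the `>=` comparison against the threshold is an assumption. The real Tier0Signal also carries more fields than the single `motion` value returned here.

```javascript
// Illustrative sketch only; names and the >= comparison are assumptions.
function makeGrayFrame(width, height, value) {
  // Uniform grayscale frame: one byte per pixel.
  return { width, height, data: new Uint8Array(width * height).fill(value) };
}

function motionScoreSketch(prev, frame) {
  if (prev.width !== frame.width || prev.height !== frame.height) {
    throw new RangeError("frame sizes differ");
  }
  const n = frame.data.length;
  if (n === 0) return 0;
  let sum = 0;
  for (let i = 0; i < n; i++) sum += Math.abs(frame.data[i] - prev.data[i]);
  return sum / (n * 255); // mean absolute diff, scaled into [0, 1]
}

function makeGateSketch(threshold) {
  let prev = null;
  return (frame) => {
    const last = prev;
    prev = frame;
    if (last === null) return null; // first frame: no history, no signal
    const motion = motionScoreSketch(last, frame);
    return motion >= threshold ? { motion } : null; // null means gated out
  };
}

const gate = makeGateSketch(0.10);
const results = [100, 100, 240].map((v) => gate(makeGrayFrame(8, 8, v)));
console.log(results); // first two frames are gated out, the third carries a motion score
```

With uniform frames of value 100 and then 240, the third score is 140/255, which clears a 0.10 threshold comfortably.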
The skill registry (the on-device "skills")
Cheap, high-fps detectors the brain selects per goal. Each consumes a signal channel and emits timed
boundaries the brain counts. The JS twin of percept_harness/detectors.py:
| skill | channel | proposes | model? |
|---|---|---|---|
| motion-periodicity | motion | one peak per rep (counting) | no — pure DSP |
| acoustic-onset | energy | one onset per sound starting | no — pure DSP |
| pose-openness | frame | one peak per whole-body rep | yes — pluggable landmarker |
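The "one peak per rep" idea in the table can be illustrated with a small streaming peak picker. This is a sketch under stated assumptions, not the shipped motion-periodicity detector: `makePeakCounterSketch`, the `minHeight` parameter, the `kind: "peak"` value, and the use of a frame index for `t` are all hypothetical. The motion trace is synthetic.

```javascript
// Hypothetical streaming peak picker; the real detector may work differently.
// Emits one boundary when the signal has been rising, reached at least
// `minHeight`, and then turns downward.
function makePeakCounterSketch({ goalId, detector, minHeight }) {
  let prev = null;    // previous sample { value, t }
  let rising = false; // true while the signal has been going up
  return function push(value, t) {
    let boundary = null;
    if (prev !== null) {
      if (value > prev.value) {
        rising = true;
      } else if (value < prev.value) {
        if (rising && prev.value >= minHeight) {
          // Field names follow the BoundaryReport entry shape.
          boundary = { goal_id: goalId, detector, kind: "peak", t: prev.t, value: prev.value };
        }
        rising = false;
      }
    }
    prev = { value, t };
    return boundary;
  };
}

// Synthetic trace: one small bump below minHeight, then three clear reps.
const series = [0, 0.05, 0, 0.2, 0.6, 0.2, 0, 0.3, 0.7, 0.3, 0, 0.1, 0.5, 0.1];
const pushSample = makePeakCounterSketch({
  goalId: "squats",
  detector: "motion-periodicity",
  minHeight: 0.4,
});
const boundaries = [];
series.forEach((value, t) => {
  const b = pushSample(value, t);
  if (b) boundaries.push(b);
});
console.log(boundaries.map((b) => b.t)); // three peaks, at t = 4, 8, 12
```

The small bump at t = 1 is ignored because it never reaches `minHeight`, so counting boundaries gives three reps.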
When a WatchSpec selects detectors, the edge runs them every frame and emits boundaries up a separate
channel (decoupled from the per-frame gauge):
    const edge = new Edge({ onBoundary: (batch) => batch.forEach(b => count(b.goal_id)) });
    edge.applyWatchSpec(/* WatchSpec with detectors:["motion-periodicity"] */);
    for (const frame of stream) edge.processFrame(frame, { energy }); // boundaries fire on the rising edges

Model-backed skills are lazy: await edge.loadSkills() loads them once on activation, never at
construction. edge.capabilities() / edge.advertise() report the registered set up the wire.
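The lazy-loading behavior can be pictured as a registry that stores factories and only calls one when its skill is activated. The following is a minimal sketch of that pattern, not the library's registry: `SkillRegistrySketch` and its methods are hypothetical, and the sorted order returned by `capabilities()` is an arbitrary choice made here.

```javascript
// Hypothetical mini-registry illustrating "load once on activation".
class SkillRegistrySketch {
  constructor() {
    this.skills = new Map(); // name -> { factory, loading: Promise | null }
  }

  register({ name, factory }) {
    // Registration stores the factory only; nothing is loaded yet.
    this.skills.set(name, { factory, loading: null });
  }

  capabilities() {
    return [...this.skills.keys()].sort();
  }

  // Load the named skills, reusing any load already started. Unknown names are skipped.
  async load(names) {
    const wanted = names.filter((n) => this.skills.has(n));
    return Promise.all(wanted.map((n) => {
      const entry = this.skills.get(n);
      if (entry.loading === null) entry.loading = Promise.resolve(entry.factory());
      return entry.loading;
    }));
  }
}

let factoryCalls = 0;
const registry = new SkillRegistrySketch();
registry.register({
  name: "pose-openness",
  factory: async () => {
    factoryCalls += 1; // stands in for an expensive model load
    return { name: "pose-openness" };
  },
});
registry.register({ name: "motion-periodicity", factory: () => ({ name: "motion-periodicity" }) });

const callsBeforeActivation = factoryCalls; // still 0: registering loads nothing
registry.load(["pose-openness"]);
registry.load(["pose-openness", "not-registered"]).then((loaded) => {
  console.log(loaded.map((s) => s.name), "factory calls:", factoryCalls);
});
```

Caching the promise rather than the resolved instance means two overlapping activations share a single load.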
Per-device customization — register only what you need
The same runtime, different skill sets (a drone, AR glasses, and a fitness coach each register what they
need). A skill is registerSkill({ name, signal, manifest, factory }), nothing more:
    import { registerPoseSkill } from "percept-edge/pose";
    registerPoseSkill(myLandmarker); // opt-in: supply your pose model adapter (MediaPipe Tasks for Web, TF.js…)

Deployment topologies
- A (all server-side): the Python percept-harness runs the gate + skills in-process with the brain. No percept-edge.
- B (on-device): percept-edge gates + runs skills on the device, advertises EdgeCapabilities, and sends Tier0Signal + BoundaryReport up; the server maps boundaries to perceive_boundary. Use this to keep frames on-device until something matters (bandwidth + privacy).
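In topology B the capability handshake means the brain should only select detectors the edge has advertised. A rough sketch of that selection step, assuming it amounts to filtering each condition's `detectors` against the advertised list, is below. `restrictToCapabilitiesSketch` is a hypothetical helper; where and how the real SDK enforces this is not described here.

```javascript
// Hypothetical helper: keep only detectors the edge advertised it can run.
function restrictToCapabilitiesSketch(watchSpec, edgeCapabilities) {
  const allowed = new Set(edgeCapabilities.capabilities);
  return {
    ...watchSpec,
    conditions: watchSpec.conditions.map((condition) => ({
      ...condition,
      detectors: condition.detectors.filter((d) => allowed.has(d)),
    })),
  };
}

// Example: an edge without a pose model advertises only the two DSP skills.
const advertised = { capabilities: ["motion-periodicity", "acoustic-onset"] };
const requested = {
  version: 1,
  conditions: [{
    goal_id: "jumping-jacks",
    anchor_text: "a person doing jumping jacks",
    threshold: 0.1,
    region_hint: null,
    detectors: ["pose-openness", "motion-periodicity"],
  }],
};
const negotiated = restrictToCapabilitiesSketch(requested, advertised);
console.log(negotiated.conditions[0].detectors); // pose-openness is dropped
```

The input spec is left untouched, so the caller can still see which detectors were requested but unavailable.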
Known tuning (it's a research preview)
The motion-gate threshold (default 0.10) is a cost↔recall knob: well-calibrated for salient motion,
but it skips very-low-motion onsets (measured edge event-recall ≈ 0 on subtle actions — the gate never
escalates, so the VLM never sees them). Tune the threshold per motion regime for your use case. This is a
deliberate post-publish tuning item, not a blocker.
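One way to pick a threshold for a given motion regime is to replay a recorded trace of per-frame motion scores at several candidate thresholds and compare how many frames escalate against how many labeled events are caught. The sketch below shows that sweep; `sweepThresholdSketch` is a hypothetical helper, the `>=` comparison is an assumption, and the scores and event labels are synthetic, not measurements from this package.

```javascript
// Hypothetical offline sweep over a recorded motion-score trace.
function sweepThresholdSketch(motionScores, eventFrames, thresholds) {
  const events = new Set(eventFrames);
  return thresholds.map((threshold) => {
    let escalated = 0; // frames that would trigger a cloud call
    let caught = 0;    // labeled event frames among those
    motionScores.forEach((score, i) => {
      if (score >= threshold) {
        escalated += 1;
        if (events.has(i)) caught += 1;
      }
    });
    return {
      threshold,
      escalatedFraction: escalated / motionScores.length,
      eventRecall: events.size === 0 ? null : caught / events.size,
    };
  });
}

// Synthetic trace: frames 2 and 6 are salient events, frames 4 and 8 are subtle ones.
const scores = [0.01, 0.02, 0.30, 0.01, 0.04, 0.02, 0.25, 0.01, 0.03, 0.01];
const eventFrames = [2, 4, 6, 8];
const sweep = sweepThresholdSketch(scores, eventFrames, [0.10, 0.03, 0.02]);
console.table(sweep);
```

On this toy trace the default-like 0.10 escalates 20% of frames but catches only half the events, while 0.03 catches all four at 40%; going lower to 0.02 only adds cost.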
Test
    npm install
    npm test # 70 vitest: contract, motion, vad, skills, pose, edge

Apache-2.0.
