ushman-characterize
v0.4.0
Per-function regression detection for refactor safety. Characterization tests with Three.js-aware trace-and-replay.
Per-function regression detection for refactor safety. Characterization tests for the post-Ushman cleanup phase: classify pure functions, instrument every top-level function/method, capture per-call args/returns at runtime, generate a per-function bun-test that fails the moment a refactor drifts.
This is a generalization of Michael Feathers' characterization testing discipline (capture observed behavior of legacy code so a refactor can verify behavioral equivalence) plus trace-and-replay (instrument once, replay forever).
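The record-once, replay-later idea can be sketched in a few lines. This is only an illustration of the discipline, not the package's tracer: `instrument`, `replay`, `TraceRecord`, and the clamp functions are hypothetical names, and JSON round-tripping stands in for the real trace serializer.

```typescript
// Hypothetical names throughout; not the ushman-characterize API.
type TraceRecord = { fn: string; args: unknown[]; returned: unknown };

// JSON round-trip as a stand-in snapshot, so later mutation cannot alter a record.
const snapshot = (v: unknown): unknown =>
  v === undefined ? undefined : JSON.parse(JSON.stringify(v));

// Wrap a function so every call appends its args and return value to `sink`.
function instrument<A extends unknown[], R>(
  name: string,
  fn: (...args: A) => R,
  sink: TraceRecord[],
): (...args: A) => R {
  return (...args: A): R => {
    const returned = fn(...args);
    sink.push({ fn: name, args: snapshot(args) as unknown[], returned: snapshot(returned) });
    return returned;
  };
}

// Replay recorded calls against a candidate and return the records that drift.
function replay<A extends unknown[], R>(
  records: TraceRecord[],
  candidate: (...args: A) => R,
): TraceRecord[] {
  return records.filter((r) => {
    const actual = candidate(...(r.args as A));
    return JSON.stringify(actual) !== JSON.stringify(r.returned);
  });
}

// Legacy behavior, observed once at capture time.
const legacyClamp = (x: number, lo: number, hi: number): number =>
  x < lo ? lo : x > hi ? hi : x;

const traces: TraceRecord[] = [];
const traced = instrument("clamp", legacyClamp, traces);
traced(5, 0, 10);
traced(-3, 0, 10);
traced(42, 0, 10);

// One refactor that preserves behavior and one that silently drops the upper bound.
const refactoredClamp = (x: number, lo: number, hi: number): number =>
  Math.min(Math.max(x, lo), hi);
const brokenClamp = (x: number, lo: number, _hi: number): number => Math.max(x, lo);

const okDrift = replay(traces, refactoredClamp);
const badDrift = replay(traces, brokenClamp);
console.log(okDrift.length, badDrift.length); // 0 1
```

The broken refactor is caught only because the captured calls happened to include a value above the upper bound, which is why capture quality matters as much as the replay step.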
The trace serializer has Three.js-aware brand-checks — but those come from the optional ushman-threejs-tools peer dependency. For non-graphical donors (browser extensions, plain web apps, embedded webviews) the brand-checks are inert and the package operates on plain JS values. Graphics support is opt-in.
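A minimal sketch of what "inert for plain values" could look like. It assumes the brand-checks rely on the boolean flags Three.js sets on its instances (`isVector3`, `isQuaternion`); how `ushman-threejs-tools` actually detects these types is not shown in this README, and `serializeWithBrandChecks`, `BRANDS`, and the `$type` tag are hypothetical.

```typescript
// Assumption: detection via Three.js-style flags. The tag shape is made up for illustration.
type Brand = { flag: string; type: string; fields: string[] };

const BRANDS: Brand[] = [
  { flag: "isVector3", type: "Vector3", fields: ["x", "y", "z"] },
  { flag: "isQuaternion", type: "Quaternion", fields: ["x", "y", "z", "w"] },
];

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// Returns a compact tagged payload for branded values and the input itself otherwise.
function serializeWithBrandChecks(value: unknown): unknown {
  if (!isRecord(value)) return value;
  for (const brand of BRANDS) {
    if (value[brand.flag] === true) {
      const out: Record<string, unknown> = { $type: brand.type };
      for (const field of brand.fields) out[field] = value[field];
      return out;
    }
  }
  return value; // no brand matched: plain JS values pass through untouched
}

// A Vector3-like object (no three.js import needed for the sketch) and a plain object.
const fakeVector = { isVector3: true, x: 1, y: 2, z: 3, extra: "ignored" };
const plain = { x: 1, y: 2, z: 3 };

const taggedVector = serializeWithBrandChecks(fakeVector);
const untouched = serializeWithBrandChecks(plain);
console.log(JSON.stringify(taggedVector)); // {"$type":"Vector3","x":1,"y":2,"z":3}
console.log(untouched === plain); // true
```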
Status
m35 is implemented in the ushman orchestrator (ushman characterize …, src/core/characterize/). This repo is the standalone, link-ready extraction target for ushman-characterize.
Runtime: mixed. Instrumentation, replay, purity-classification, and generate-replay are pure Node. capture requires Puppeteer.
What this package is NOT
- Not a code-coverage tool. `c8`/Istanbul tell us "this line was hit"; refactor safety needs "this function still produces the same output."
- Not a parity harness. That stays in `ushman` orchestrator's `parity/`.
- Not a snapshot tester. We do not write `expect.toMatchSnapshot()` files; we capture traces that survive normal refactoring.
- Not a mutation tester.
Install
bun install --global ushman-characterize
bun add ushman-characterize
Breaking Changes
- v4 fixtures moved from `.ushman/test-harness/` to `.lab/characterize/`.
- Generated tests now live under `tests/{pure,scene,replay}/`.
- `capture` now boots the candidate via Vite preview by default.
- v3 workspaces are rejected with the standard v4 cutover stub hint.
Local Dev
- This repo currently depends on a sibling checkout of `ushman-ledger` via `file:../ushman-ledger`.
- For local installs to work, keep `ushman-characterize` and `ushman-ledger` as sibling directories under the same parent.
Quick start (the 4-phase loop)
ushman-characterize stub-pure <ws> --bundle=public/assets/bundle.js # Phase A
ushman-characterize stub-states <ws> # Phase B
ushman-characterize instrument --bundle=<in> --output=<out> --source-map=external # Phase C.1
ushman-characterize capture <ws> --bundle=<in> --states=0-lobby,1-game # Phase C.2
ushman-characterize generate-replay <ws> --bundle=<in> # Phase C.3
bun test tests/replay/ # run the generated tests
After the LLM (or operator) refactors a class out of the monolith:
ushman-characterize rebind <ws> --map='.lab/characterize/modules/CameraController.mjs:CameraController=src/camera/CameraController.js'
bun test tests/replay/CameraController.*.test.ts
Phases
| Phase | What | Effort to enable | Yield |
|-------|------|------------------|-------|
| A | Pure-function snapshot tests | <1 day | ~200+ tests on Capybara |
| B | Scene-state snapshot tests | <1 day | ~one test per captured state |
| C | Trace-and-replay (the main lift) | 5–7 days | ~400+ tests on Capybara |
| D | Module-contract rebind after extraction | <1 day | Tests follow the refactor |
Public API
classifyPureTopLevelFunctions(opts): PureClassificationResult
readCapturedTraceState(opts): Promise<CapturedTraceRecord[]>
scaffoldPureCharacterizationTests(opts): Promise<{ pureFunctionCount, written, ... }>
scaffoldSceneCharacterizationTests(opts): Promise<{ written }>
populateScaffolds({
  workspaceDir,
  modulePathPrefix?,
  symbolFilter?,
}): Promise<{ sidecarPath, written, skipped, synthesizedTraces, toolingGaps }>
populateSmokeScaffolds({
  workspaceDir,
}): Promise<{ sidecarPath, written, skipped, toolingGaps }>
runStubPureCommand({
  workspaceRoot,
  bundlePath?,
  regenStale?,
  log,
}): Promise<...>
runStubStatesCommand({
  workspaceRoot,
  regenStale?,
  log,
}): Promise<...>
canonicalizeSceneTree(value, opts?): unknown
instrumentBundle(opts): Promise<InstrumentResult>
captureCharacterization(opts): Promise<CaptureCharacterizationResult>
generateReplayCharacterization(opts): Promise<{ writtenTests, writtenFixtures }>
rewriteReplayImports(opts): Promise<{ files }>
createConsoleLogger(): Logger
canonicalizeTrace(value, opts?): unknown
type CaptureServerHost
type SceneInspectorDriver
CLI
ushman-characterize --help
ushman-characterize stub-pure <workspace> --bundle=<path>
ushman-characterize stub-pure <workspace> --regen-stale
ushman-characterize stub-states <workspace>
ushman-characterize stub-states <workspace> --regen-stale
ushman-characterize instrument --bundle=<path> --output=<path> [--source-map=external|inline|off]
ushman-characterize capture <workspace> [--bundle=...] [--states=a,b] [--scene-only] [--capture-side-effects] [--mode=preview|dev]
ushman-characterize generate-replay <workspace> --bundle=<path> [--max-cases=10]
ushman-characterize replay <workspace> [--strict|--lenient] [--filter=<pattern>]
ushman-characterize rebind <workspace> --map='symbol=path' [--map='workspace/module.mjs:symbol=path'] [--map=...] [--dry-run] [--yes]
Notes:
- Normal `capture` requires `--bundle=<path>`. `--scene-only` is the only mode that can omit it.
- Characterize fixtures now live under `<ws>/.lab/characterize/` and derived tests live under `<ws>/tests/{pure,scene,replay}/`.
- `stub-pure --regen-stale` populates seed-time `tests/pure/**/*.test.ts` scaffolds only when the current file hash still matches `tests/.seed-fingerprints.json`. Operator-edited files are skipped unchanged.
- `stub-pure --regen-stale` consumes canonical module traces from `<ws>/.lab/characterize/traces/<module>.jsonl`. If only legacy state traces exist, characterize synthesizes the module trace once from unambiguous symbol matches and then reuses that file on later runs.
- `stub-pure --regen-stale` trusts the symbol list declared by existing `test.todo(...)` blocks. The populator does not re-enumerate exports from source files.
- Legacy state trace synthesis is intentionally conservative. If the same symbol name appears in more than one managed scaffold, characterize leaves those scaffolds untouched and records a tooling-gap instead of guessing which module owns the trace rows.
- `stub-states --regen-stale` populates seed-time `tests/smoke/<state>.test.ts` scaffolds when `<ws>/parity/baseline/<state>.png` exists, `tests/smoke/{config,diff,drive}.ts` already exist, and the target scaffold hash still matches the sidecar.
- Editing any auto-managed scaffold, even comments or whitespace, changes its fingerprint and makes future `--regen-stale` runs skip that file on purpose.
- V4 capture reads the state DAG only from `<ws>/.lab/state-dag.json`.
- `capture` boots the candidate through `vite build && vite preview` by default. Use `--mode=dev` only for a faster local smoke loop.
- `instrument` preserves source maps. The standalone CLI defaults to adjacent `.map` files; in-process capture uses inline maps so browser stack traces point back at the pre-instrumented bundle.
- `scene-only` capture needs a `SceneInspectorDriver`. The CLI will auto-load one from `ushman-threejs-tools/inspector` when that peer is installed.
- `capture` is partial-success aware: successful earlier states are kept, failed later states are reported with a yellow validator verdict, and oversized existing trace files fail only the affected state. The result now surfaces `attemptedStates`, `completedStates`, and `skippedStates` alongside `failures`.
- `replay` writes a versioned report to `.lab/characterize/reports/verify-report.json`.
- Trace JSONL records and replay fixture JSON files are explicitly versioned. The generated replay harness rejects incompatible fixture/support versions with a regeneration hint.
- `capture`, `rebind`, and seed-scaffold population emit `validator-result` ledger entries through `ushman-ledger` on every run. Missing traces or smoke baselines are recorded as `tooling-gap` notes instead of silently generating empty tests.
- `rebind --dry-run` prints before/after import previews for generated replay tests.
- Mixed replay imports are split per destination. Unmapped specifiers remain on the generated module import.
- Shorthand `rebind --map='symbol=path'` only works when that symbol is unambiguous across generated replay imports. Use file-scoped mappings when the same symbol name appears in more than one generated module.
- Anonymous default-exported callables are traced and replayed under a synthetic binding name. The default synthetic name is `defaultExport`; if that name is already taken in the bundle, generation picks `defaultExport_2`, `defaultExport_3`, and so on.
- Async side-effect attribution survives native `await` inside traced top-level functions, plus synchronous callbacks registered from an active traced chunk through `Promise.then`, `queueMicrotask`, `setTimeout`, `setInterval`, and `requestAnimationFrame`.
- Unsupported async attribution is explicit: callbacks that later suspend with their own `await`, `for await...of` / async-iterator control flow, DOM/event-listener callbacks fired by unrelated user activity, worker/message boundaries, and shared callback registries are not treated as reliable ownership signals.
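The `--regen-stale` fingerprint rule from the notes above can be sketched as a hash comparison against the sidecar. This is a minimal illustration under assumptions: `canRegenerate` and `sha256` are hypothetical helpers, hashing the raw UTF-8 file text is a guess at what gets fingerprinted, and the `SeedFingerprints` type only mirrors the sidecar format documented in this README.

```typescript
import { createHash } from "node:crypto";

// Mirrors the documented sidecar shape; everything else here is illustrative.
type SeedFingerprints = {
  schemaVersion: string;
  scaffolds: Record<string, string>;
};

// Assumption: the fingerprint is the SHA-256 of the file's UTF-8 text.
const sha256 = (text: string): string =>
  createHash("sha256").update(text, "utf8").digest("hex");

// A scaffold is eligible only while its current hash equals the recorded seed hash.
function canRegenerate(
  sidecar: SeedFingerprints,
  relPath: string,
  currentContents: string,
): boolean {
  const recorded = sidecar.scaffolds[relPath];
  return recorded !== undefined && recorded === sha256(currentContents);
}

const scaffoldPath = "tests/pure/util/math.test.ts";
const seedContents = 'test.todo("clamp");\n';
const sidecar: SeedFingerprints = {
  schemaVersion: "shibuk-seed-fingerprints/v1",
  scaffolds: { [scaffoldPath]: sha256(seedContents) },
};

// Untouched scaffold: eligible. Any edit, even a comment, changes the hash.
const untouchedOk = canRegenerate(sidecar, scaffoldPath, seedContents);
const editedOk = canRegenerate(sidecar, scaffoldPath, "// operator note\n" + seedContents);
const unknownOk = canRegenerate(sidecar, "tests/pure/other.test.ts", seedContents);
console.log(untouchedOk, editedOk, unknownOk); // true false false
```

Treating an unknown path as ineligible is a design choice of the sketch: a file with no recorded fingerprint cannot be proven to be an untouched scaffold.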
Seed scaffold sidecar format:
{
  "schemaVersion": "shibuk-seed-fingerprints/v1",
  "scaffolds": {
    "tests/pure/util/math.test.ts": "<sha256>",
    "tests/smoke/lobby.test.ts": "<sha256>"
  }
}
Serializer Contract
Trace canonicalization is shared across Node, the browser tracer, and the generated replay harness. The serializer currently covers:
- special numeric values (`NaN`, `Infinity`, `-Infinity`, `-0`)
- cycles and repeated references
- `ArrayBuffer` and typed-array views
- `Map`, `Set`, plain objects, and instance-like objects
- Three.js-like values including `Vector2`, `Vector3`, `Vector4`, `Euler`, `Quaternion`, `Matrix4`, `Color`, `Object3D`, `BufferGeometry`, `Material`, and `Texture`
By default floats are rounded to 4 decimals, object uuid fields are stripped from generic object payloads, each state trace is capped at 50 MB, and capture refuses to load more than 25,000 unique trace records for a single state without operator intervention.
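To make the defaults concrete, here is a rough sketch of a canonicalizer that rounds floats to 4 decimals, strips `uuid` fields, tags special numeric values, and replaces cycles or repeated references with back-references. It is not the package's `canonicalizeTrace`: the `$num` and `$ref` tag shapes and the sorted-key output are assumptions made for the example, and typed arrays, `Map`, and `Set` are left out.

```typescript
// Illustrative only: tag shapes ($num, $ref) are assumptions, not the real trace encoding.
function canonicalize(value: unknown, decimals = 4): unknown {
  const seen = new Map<object, number>(); // object -> id of its first visit

  const walk = (v: unknown): unknown => {
    if (typeof v === "number") {
      // JSON cannot represent these values, so they become explicit tags.
      if (Number.isNaN(v)) return { $num: "NaN" };
      if (v === Infinity) return { $num: "Infinity" };
      if (v === -Infinity) return { $num: "-Infinity" };
      if (Object.is(v, -0)) return { $num: "-0" };
      const factor = 10 ** decimals;
      return Math.round(v * factor) / factor;
    }
    if (typeof v !== "object" || v === null) return v;

    // Cycles and repeated references collapse to a pointer at the first visit.
    const known = seen.get(v);
    if (known !== undefined) return { $ref: known };
    seen.set(v, seen.size);

    if (Array.isArray(v)) return v.map((item) => walk(item));

    const out: Record<string, unknown> = {};
    for (const key of Object.keys(v).sort()) {
      if (key === "uuid") continue; // per-run identifiers would break replay equality
      out[key] = walk((v as Record<string, unknown>)[key]);
    }
    return out;
  };

  return walk(value);
}

const node: Record<string, unknown> = {
  uuid: "abc-123",
  position: [0.1 + 0.2, -0, NaN],
};
node.self = node; // introduce a cycle

const canonical = JSON.stringify(canonicalize(node));
console.log(canonical);
// {"position":[0.3,{"$num":"-0"},{"$num":"NaN"}],"self":{"$ref":0}}
```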
Trace deduplication keys include the function identity, canonicalized arguments, receiver snapshot, return/throw payload, and captured side-effect shape. Calls that keep the same args and return value but arrive with a different thisArg snapshot or a different side-effect sequence are preserved as separate replay cases.
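A minimal sketch of that deduplication rule, assuming every field has already been canonicalized. `CallRecord`, `dedupKey`, and `dedupe` are hypothetical names, and `JSON.stringify` stands in for whatever stable encoding the real implementation uses.

```typescript
// Hypothetical record shape covering the five key components listed above.
type CallRecord = {
  fn: string;
  args: unknown[];
  thisArg: unknown;
  outcome: { kind: "return" | "throw"; value: unknown };
  sideEffects: string[];
};

// Every component participates in the key, so a difference in any one keeps the call.
function dedupKey(r: CallRecord): string {
  return JSON.stringify([r.fn, r.args, r.thisArg, r.outcome, r.sideEffects]);
}

// Keep the first record seen for each key, preserving capture order.
function dedupe(records: CallRecord[]): CallRecord[] {
  const byKey = new Map<string, CallRecord>();
  for (const r of records) {
    const key = dedupKey(r);
    if (!byKey.has(key)) byKey.set(key, r);
  }
  return Array.from(byKey.values());
}

const base: CallRecord = {
  fn: "Camera.zoom",
  args: [2],
  thisArg: { fov: 50 },
  outcome: { kind: "return", value: 25 },
  sideEffects: [],
};

const captured: CallRecord[] = [
  base,
  { ...base }, // exact duplicate: dropped
  { ...base, thisArg: { fov: 75 } }, // same args and return, different receiver: kept
  { ...base, sideEffects: ["set:dirty"] }, // same args and return, different side effects: kept
];

const unique = dedupe(captured);
console.log(unique.length); // 3
```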
Trace JSONL records carry ushman.characterize.trace-record@1, replay fixtures carry ushman.characterize.replay-fixture@1, and both also store an explicit support version. When a change to serializer or harness semantics means old generated fixtures or harness files can no longer be trusted without regeneration, bump the support version in lockstep with the replay report schema.
Format Version Migration
When serializer, capture, or replay semantics change in a way that makes older traces untrustworthy:
- Bump `CHARACTERIZE_SUPPORT_VERSION` in src/constants.ts.
- Re-run `ushman-characterize capture <ws> ...` to regenerate trace JSONL files.
- Re-run `ushman-characterize generate-replay <ws> ...` to regenerate replay fixtures and generated tests.
- Re-run `ushman-characterize replay <ws> --strict` to confirm the regenerated harness is green.
Schema-version bumps are reserved for structural format changes to the trace or replay fixture documents. Support-version bumps cover semantic/runtime changes where old files still parse but should no longer be trusted without regeneration.
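The two kinds of version check could look roughly like this. Only the schema identifier string comes from this README; the header field names (`schema`, `supportVersion`), the numeric support version, and `checkFixture` itself are assumptions made for the sketch.

```typescript
// The schema id is documented above; the support-version value here is hypothetical.
const REPLAY_FIXTURE_SCHEMA = "ushman.characterize.replay-fixture@1";
const SUPPORT_VERSION = 3;

// Assumed header shape; the real fixture documents may name these fields differently.
type FixtureHeader = { schema: string; supportVersion: number };
type CheckResult = { ok: true } | { ok: false; hint: string };

function checkFixture(header: FixtureHeader): CheckResult {
  // Structural mismatch: the document layout itself is not one this harness understands.
  if (header.schema !== REPLAY_FIXTURE_SCHEMA) {
    return { ok: false, hint: `unsupported fixture schema "${header.schema}"` };
  }
  // Semantic mismatch: the file parses, but was produced under different semantics.
  if (header.supportVersion !== SUPPORT_VERSION) {
    return {
      ok: false,
      hint:
        `fixture support version ${header.supportVersion} does not match ${SUPPORT_VERSION}; ` +
        "re-run capture and generate-replay to regenerate",
    };
  }
  return { ok: true };
}

const fresh = checkFixture({ schema: REPLAY_FIXTURE_SCHEMA, supportVersion: SUPPORT_VERSION });
const stale = checkFixture({ schema: REPLAY_FIXTURE_SCHEMA, supportVersion: SUPPORT_VERSION - 1 });
const foreign = checkFixture({ schema: "something-else@1", supportVersion: SUPPORT_VERSION });
console.log(fresh.ok, stale.ok, foreign.ok); // true false false
```

Checking the schema before the support version keeps the two failure modes distinct: a structural mismatch is reported as such rather than being masked by a regeneration hint.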
Why this exists separate from ushman
The decomposition campaign is regular SWE work post-Ushman, enabled by m35 (per IMPLEMENTATION-PLAN.md's m35 row). Cloud LLMs and CI integrations that drive refactors should be able to import the trace harness without cloning the entire reverse-engineering pipeline.
Where this fits in the family
| | |
|---|---|
| Depends on | ushman-threejs-tools (peer — for Vector3 / Quaternion / Object3D brand checks in the trace serializer) |
| Verified by | ushman-verify Tier 0c+0d (the instrumented bundle must Babel-parse and have no undefined references) |
| Runs alongside | ushman parity for the full safety net during a decomposition campaign |
