ushman-characterize

v0.4.0

Published

8 days ago

Per-function regression detection for refactor safety. Characterization tests with Three.js-aware trace-and-replay.


ragaeeb

ushman characterization tdd refactor threejs trace-replay

ushman-characterize

Per-function regression detection for refactor safety. Characterization tests for the post-Ushman cleanup phase: the package classifies pure functions, instruments every top-level function and method, captures per-call arguments and return values at runtime, and generates a per-function bun test that fails the moment a refactor drifts.

This is a generalization of Michael Feathers' characterization testing discipline (capture observed behavior of legacy code so a refactor can verify behavioral equivalence) plus trace-and-replay (instrument once, replay forever).
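
The trace-and-replay idea can be illustrated with a minimal, self-contained sketch. Every name here (`traced`, `replay`, `TraceRecord`) is hypothetical and not part of this package's API; the sketch only shows the general discipline of recording observed behavior once and replaying it against a refactored candidate.

```typescript
// Hypothetical sketch: none of these names come from ushman-characterize.
type TraceRecord = { fn: string; args: unknown[]; returned: unknown };

// Wrap a function so every call appends its args and return value to `log`.
function traced<A extends unknown[], R>(
  name: string,
  fn: (...args: A) => R,
  log: TraceRecord[],
): (...args: A) => R {
  return (...args: A): R => {
    const returned = fn(...args);
    // Deep-copy through JSON so later mutation cannot alter the record.
    log.push(JSON.parse(JSON.stringify({ fn: name, args, returned })));
    return returned;
  };
}

// Replay recorded calls against a candidate and return the records that drifted.
function replay<A extends unknown[], R>(
  candidate: (...args: A) => R,
  records: TraceRecord[],
): TraceRecord[] {
  return records.filter(
    (r) => JSON.stringify(candidate(...(r.args as A))) !== JSON.stringify(r.returned),
  );
}

// Capture the legacy behavior once.
const log: TraceRecord[] = [];
const legacyClamp = (x: number, lo: number, hi: number) => (x < lo ? lo : x > hi ? hi : x);
const tracedClamp = traced("clamp", legacyClamp, log);
tracedClamp(5, 0, 10);
tracedClamp(-3, 0, 10);
tracedClamp(42, 0, 10);

// A faithful refactor and a subtly wrong one (min and max swapped).
const goodRefactor = (x: number, lo: number, hi: number) => Math.min(Math.max(x, lo), hi);
const badRefactor = (x: number, lo: number, hi: number) => Math.max(Math.min(x, lo), hi);

const goodDrift = replay(goodRefactor, log);
const badDrift = replay(badRefactor, log);
```

In this toy example the faithful refactor replays cleanly, while the swapped version is flagged on the recorded calls where its output differs from what the legacy function returned.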

The trace serializer has Three.js-aware brand-checks — but those come from the optional ushman-threejs-tools peer dependency. For non-graphical donors (browser extensions, plain web apps, embedded webviews) the brand-checks are inert and the package operates on plain JS values. Graphics support is opt-in.

Status

m35 is implemented in the ushman orchestrator (ushman characterize …, src/core/characterize/). This repo is the standalone, link-ready extraction target for ushman-characterize.

Runtime is mixed: instrumentation, replay, purity classification, and generate-replay are pure Node; only capture requires Puppeteer.

What this package is NOT

Not a code-coverage tool. c8 / Istanbul tell us "this line was hit"; refactor safety needs "this function still produces the same output."
Not a parity harness. That stays in the ushman orchestrator's parity/.
Not a snapshot tester. We do not write expect.toMatchSnapshot() files; we capture traces that survive normal refactoring.
Not a mutation tester.

Install

bun install --global ushman-characterize
bun add ushman-characterize

Breaking Changes

v4 fixtures moved from .ushman/test-harness/ to .lab/characterize/.
Generated tests now live under tests/{pure,scene,replay}/.
capture now boots the candidate via Vite preview by default.
v3 workspaces are rejected with the standard v4 cutover stub hint.

Local Dev

This repo currently depends on a sibling checkout of ushman-ledger via file:../ushman-ledger.
For local installs to work, keep ushman-characterize and ushman-ledger as sibling directories under the same parent.

Quick start (the 4-phase loop)

ushman-characterize stub-pure <ws> --bundle=public/assets/bundle.js   # Phase A
ushman-characterize stub-states <ws>                                  # Phase B
ushman-characterize instrument --bundle=<in> --output=<out> --source-map=external # Phase C.1
ushman-characterize capture <ws> --bundle=<in> --states=0-lobby,1-game # Phase C.2
ushman-characterize generate-replay <ws> --bundle=<in>                # Phase C.3
bun test tests/replay/                                                 # run the generated tests

After the LLM (or operator) refactors a class out of the monolith:

ushman-characterize rebind <ws> --map='.lab/characterize/modules/CameraController.mjs:CameraController=src/camera/CameraController.js'
bun test tests/replay/CameraController.*.test.ts
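
The two --map shapes (shorthand 'symbol=path' and file-scoped 'module:symbol=path') can be pictured with a small parser. This is a sketch under assumptions: `parseRebindMap` and `RebindMapping` are hypothetical names, and the real CLI's parsing and validation rules may differ.

```typescript
// Hypothetical parser; the real CLI's parsing rules may differ.
interface RebindMapping {
  sourceModule?: string; // present only for file-scoped mappings
  symbol: string;
  targetPath: string;
}

function parseRebindMap(value: string): RebindMapping {
  // Split on the first '=': left side names the symbol, right side the new path.
  const eq = value.indexOf("=");
  if (eq <= 0 || eq === value.length - 1) {
    throw new Error(`Expected 'symbol=path' or 'module:symbol=path', got '${value}'`);
  }
  const left = value.slice(0, eq);
  const targetPath = value.slice(eq + 1);

  // Split on the last ':' so a module path keeps any earlier colons.
  const colon = left.lastIndexOf(":");
  if (colon === -1) return { symbol: left, targetPath };

  const sourceModule = left.slice(0, colon);
  const symbol = left.slice(colon + 1);
  if (sourceModule === "" || symbol === "") {
    throw new Error(`Malformed file-scoped mapping: '${value}'`);
  }
  return { sourceModule, symbol, targetPath };
}

const scoped = parseRebindMap(
  ".lab/characterize/modules/CameraController.mjs:CameraController=src/camera/CameraController.js",
);
const shorthand = parseRebindMap("CameraController=src/camera/CameraController.js");
```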

Phases

| Phase | What | Effort to enable | Yield |
|-------|------|------------------|-------|
| A | Pure-function snapshot tests | <1 day | ~200+ tests on Capybara |
| B | Scene-state snapshot tests | <1 day | ~one test per captured state |
| C | Trace-and-replay (the main lift) | 5–7 days | ~400+ tests on Capybara |
| D | Module-contract rebind after extraction | <1 day | Tests follow the refactor |

Public API

classifyPureTopLevelFunctions(opts): PureClassificationResult
readCapturedTraceState(opts): Promise<CapturedTraceRecord[]>
scaffoldPureCharacterizationTests(opts): Promise<{ pureFunctionCount, written, ... }>
scaffoldSceneCharacterizationTests(opts): Promise<{ written }>
populateScaffolds({
  workspaceDir,
  modulePathPrefix?,
  symbolFilter?,
}): Promise<{ sidecarPath, written, skipped, synthesizedTraces, toolingGaps }>
populateSmokeScaffolds({
  workspaceDir,
}): Promise<{ sidecarPath, written, skipped, toolingGaps }>
runStubPureCommand({
  workspaceRoot,
  bundlePath?,
  regenStale?,
  log,
}): Promise<...>
runStubStatesCommand({
  workspaceRoot,
  regenStale?,
  log,
}): Promise<...>
canonicalizeSceneTree(value, opts?): unknown
instrumentBundle(opts): Promise<InstrumentResult>
captureCharacterization(opts): Promise<CaptureCharacterizationResult>
generateReplayCharacterization(opts): Promise<{ writtenTests, writtenFixtures }>
rewriteReplayImports(opts): Promise<{ files }>
createConsoleLogger(): Logger
canonicalizeTrace(value, opts?): unknown
type CaptureServerHost
type SceneInspectorDriver

CLI

ushman-characterize --help
ushman-characterize stub-pure <workspace> --bundle=<path>
ushman-characterize stub-pure <workspace> --regen-stale
ushman-characterize stub-states <workspace>
ushman-characterize stub-states <workspace> --regen-stale
ushman-characterize instrument --bundle=<path> --output=<path> [--source-map=external|inline|off]
ushman-characterize capture <workspace> [--bundle=...] [--states=a,b] [--scene-only] [--capture-side-effects] [--mode=preview|dev]
ushman-characterize generate-replay <workspace> --bundle=<path> [--max-cases=10]
ushman-characterize replay <workspace> [--strict|--lenient] [--filter=<pattern>]
ushman-characterize rebind <workspace> --map='symbol=path' [--map='workspace/module.mjs:symbol=path'] [--map=...] [--dry-run] [--yes]

Notes:

Normal capture requires --bundle=<path>. --scene-only is the only mode that can omit it.
Characterize fixtures now live under <ws>/.lab/characterize/ and derived tests live under <ws>/tests/{pure,scene,replay}/.
stub-pure --regen-stale populates seed-time tests/pure/**/*.test.ts scaffolds only when the current file hash still matches tests/.seed-fingerprints.json. Operator-edited files are skipped unchanged.
stub-pure --regen-stale consumes canonical module traces from <ws>/.lab/characterize/traces/<module>.jsonl. If only legacy state traces exist, characterize synthesizes the module trace once from unambiguous symbol matches and then reuses that file on later runs.
stub-pure --regen-stale trusts the symbol list declared by existing test.todo(...) blocks. The populator does not re-enumerate exports from source files.
Legacy state trace synthesis is intentionally conservative. If the same symbol name appears in more than one managed scaffold, characterize leaves those scaffolds untouched and records a tooling-gap instead of guessing which module owns the trace rows.
stub-states --regen-stale populates seed-time tests/smoke/<state>.test.ts scaffolds when <ws>/parity/baseline/<state>.png exists, tests/smoke/{config,diff,drive}.ts already exist, and the target scaffold hash still matches the sidecar.
Editing any auto-managed scaffold, even comments or whitespace, changes its fingerprint and makes future --regen-stale runs skip that file on purpose.
V4 capture reads the state DAG only from <ws>/.lab/state-dag.json.
capture boots the candidate through vite build && vite preview by default. Use --mode=dev only for a faster local smoke loop.
instrument preserves source maps. The standalone CLI defaults to adjacent .map files; in-process capture uses inline maps so browser stack traces point back at the pre-instrumented bundle.
scene-only capture needs a SceneInspectorDriver. The CLI will auto-load one from ushman-threejs-tools/inspector when that peer is installed.
capture is partial-success aware: successful earlier states are kept, failed later states are reported with a yellow validator verdict, and oversized existing trace files fail only the affected state. The result now surfaces attemptedStates, completedStates, and skippedStates alongside failures.
replay writes a versioned report to .lab/characterize/reports/verify-report.json.
Trace JSONL records and replay fixture JSON files are explicitly versioned. The generated replay harness rejects incompatible fixture/support versions with a regeneration hint.
capture, rebind, and seed-scaffold population emit validator-result ledger entries through ushman-ledger on every run. Missing traces or smoke baselines are recorded as tooling-gap notes instead of silently generating empty tests.
rebind --dry-run prints before/after import previews for generated replay tests.
Mixed replay imports are split per destination. Unmapped specifiers remain on the generated module import.
Shorthand rebind --map='symbol=path' only works when that symbol is unambiguous across generated replay imports. Use file-scoped mappings when the same symbol name appears in more than one generated module.
Anonymous default-exported callables are traced and replayed under a synthetic binding name. The default synthetic name is defaultExport; if that name is already taken in the bundle, generation picks defaultExport_2, defaultExport_3, and so on.
Async side-effect attribution survives native await inside traced top-level functions, plus synchronous callbacks registered from an active traced chunk through Promise.then, queueMicrotask, setTimeout, setInterval, and requestAnimationFrame.
Unsupported async attribution is explicit: callbacks that later suspend with their own await, for await...of / async-iterator control flow, DOM/event-listener callbacks fired by unrelated user activity, worker/message boundaries, and shared callback registries are not treated as reliable ownership signals.
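
The synthetic-name rule for anonymous default exports described in the notes above (defaultExport, then defaultExport_2, defaultExport_3, and so on) could be sketched as follows. `pickSyntheticDefaultName` is a hypothetical helper, and `taken` stands in for however the generator actually tracks binding names in the bundle.

```typescript
// Hypothetical helper; `taken` stands in for the bundle's existing binding names.
function pickSyntheticDefaultName(
  taken: ReadonlySet<string>,
  base = "defaultExport",
): string {
  if (!taken.has(base)) return base;
  // Suffixes start at _2, matching defaultExport_2, defaultExport_3, ...
  for (let n = 2; ; n++) {
    const candidate = `${base}_${n}`;
    if (!taken.has(candidate)) return candidate;
  }
}

const free = pickSyntheticDefaultName(new Set<string>());
const second = pickSyntheticDefaultName(new Set<string>(["defaultExport"]));
const third = pickSyntheticDefaultName(new Set<string>(["defaultExport", "defaultExport_2"]));
```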

Seed scaffold sidecar format:

{
  "schemaVersion": "shibuk-seed-fingerprints/v1",
  "scaffolds": {
    "tests/pure/util/math.test.ts": "<sha256>",
    "tests/smoke/lobby.test.ts": "<sha256>"
  }
}
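
The fingerprint gate behind --regen-stale can be pictured as a plain hash comparison against this sidecar. The sketch below is an assumption-laden illustration: `SeedFingerprints`, `sha256Hex`, and `isUnedited` are hypothetical names, and it assumes the hash is taken over the file contents as UTF-8 text, which this README does not specify.

```typescript
import { createHash } from "node:crypto";

// Hypothetical type mirroring the sidecar shape shown above.
interface SeedFingerprints {
  schemaVersion: string;
  scaffolds: Record<string, string>;
}

// Assumption: the fingerprint is a SHA-256 over the file contents as UTF-8.
function sha256Hex(contents: string): string {
  return createHash("sha256").update(contents, "utf8").digest("hex");
}

// A scaffold is eligible for regeneration only while its current hash
// still matches the hash recorded at seed time.
function isUnedited(
  sidecar: SeedFingerprints,
  relPath: string,
  currentContents: string,
): boolean {
  const recorded = sidecar.scaffolds[relPath];
  return recorded !== undefined && recorded === sha256Hex(currentContents);
}

const seed = 'test.todo("clamp");\n';
const sidecar: SeedFingerprints = {
  schemaVersion: "shibuk-seed-fingerprints/v1",
  scaffolds: { "tests/pure/util/math.test.ts": sha256Hex(seed) },
};

const untouched = isUnedited(sidecar, "tests/pure/util/math.test.ts", seed);
// Even a comment-only edit changes the hash, so the file would be skipped.
const edited = isUnedited(sidecar, "tests/pure/util/math.test.ts", seed + "// note\n");
const unknownFile = isUnedited(sidecar, "tests/pure/other.test.ts", seed);
```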

Serializer Contract

Trace canonicalization is shared across Node, the browser tracer, and the generated replay harness. The serializer currently covers:

special numeric values (NaN, Infinity, -Infinity, -0)
cycles and repeated references
ArrayBuffer and typed-array views
Map, Set, plain objects, and instance-like objects
Three.js-like values including Vector2, Vector3, Vector4, Euler, Quaternion, Matrix4, Color, Object3D, BufferGeometry, Material, and Texture

By default floats are rounded to 4 decimals, object uuid fields are stripped from generic object payloads, each state trace is capped at 50 MB, and capture refuses to load more than 25,000 unique trace records for a single state without operator intervention.
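
To make the defaults concrete, here is a minimal sketch of a canonicalizer that rounds floats to 4 decimals, tags special numeric values, strips uuid fields, and replaces repeated or cyclic references. It is not the package's canonicalizeTrace: the function name `canonicalize` and the `$num` / `$ref` tags are illustrative assumptions, and the sketch ignores typed arrays, Map, Set, and Three.js values.

```typescript
// Hypothetical sketch; tag names such as "$num" and "$ref" are illustrative only.
function canonicalize(
  value: unknown,
  decimals = 4,
  seen = new Map<object, number>(),
): unknown {
  if (typeof value === "number") {
    // Special values do not survive JSON, so tag them explicitly.
    if (Number.isNaN(value)) return { $num: "NaN" };
    if (value === Infinity) return { $num: "Infinity" };
    if (value === -Infinity) return { $num: "-Infinity" };
    if (Object.is(value, -0)) return { $num: "-0" };
    const factor = 10 ** decimals;
    return Math.round(value * factor) / factor;
  }
  if (value === null || typeof value !== "object") return value;

  // Cycles and repeated references become a pointer to the first visit.
  const known = seen.get(value);
  if (known !== undefined) return { $ref: known };
  seen.set(value, seen.size);

  if (Array.isArray(value)) {
    return value.map((v) => canonicalize(v, decimals, seen));
  }
  const out: Record<string, unknown> = {};
  // Sorted keys keep the output independent of property insertion order.
  for (const key of Object.keys(value).sort()) {
    if (key === "uuid") continue; // strip volatile identifiers
    out[key] = canonicalize((value as Record<string, unknown>)[key], decimals, seen);
  }
  return out;
}

const plain = JSON.stringify(canonicalize({ uuid: "abc", y: NaN, x: 0.123456 }));

const node: Record<string, unknown> = { name: "a" };
node.self = node;
const cyclic = JSON.stringify(canonicalize(node));
```

Rounding and uuid stripping are what let a trace survive a refactor that changes floating-point noise or object identity without changing behavior.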

Trace deduplication keys include the function identity, canonicalized arguments, receiver snapshot, return/throw payload, and captured side-effect shape. Calls that keep the same args and return value but arrive with a different thisArg snapshot or a different side-effect sequence are preserved as separate replay cases.
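
One way to picture such a deduplication key is a hash over all five components in a fixed order. The sketch below is illustrative only: `TraceCall`, `dedupKey`, and `dedupe` are hypothetical names, and it assumes every payload has already been canonicalized so that JSON.stringify is stable.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; field names are illustrative.
interface TraceCall {
  fn: string;
  args: unknown[];
  thisArg: unknown;
  outcome: { kind: "return" | "throw"; value: unknown };
  sideEffects: string[];
}

function dedupKey(call: TraceCall): string {
  // Fixed component order keeps the key stable across runs.
  const payload = JSON.stringify([
    call.fn,
    call.args,
    call.thisArg,
    call.outcome.kind,
    call.outcome.value,
    call.sideEffects,
  ]);
  return createHash("sha256").update(payload).digest("hex");
}

// Keep the first call seen for each distinct key.
function dedupe(calls: TraceCall[]): TraceCall[] {
  const byKey = new Map<string, TraceCall>();
  for (const call of calls) {
    const key = dedupKey(call);
    if (!byKey.has(key)) byKey.set(key, call);
  }
  return Array.from(byKey.values());
}

const base: TraceCall = {
  fn: "Camera.zoom",
  args: [2],
  thisArg: { fov: 50 },
  outcome: { kind: "return", value: 25 },
  sideEffects: [],
};
const unique = dedupe([
  base,
  { ...base },                              // exact duplicate: dropped
  { ...base, thisArg: { fov: 60 } },        // different receiver: kept
  { ...base, sideEffects: ["dom.write"] },  // different side effects: kept
]);
```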

Trace JSONL records carry ushman.characterize.trace-record@1, replay fixtures carry ushman.characterize.replay-fixture@1, and both also store an explicit support version. When serializer or harness semantics change, bump the support version in lockstep with the replay report schema whenever old generated fixtures or harness files would no longer be trustworthy without regeneration.

Format Version Migration

When serializer, capture, or replay semantics change in a way that makes older traces untrustworthy:

Bump CHARACTERIZE_SUPPORT_VERSION in src/constants.ts.
Re-run ushman-characterize capture <ws> ... to regenerate trace JSONL files.
Re-run ushman-characterize generate-replay <ws> ... to regenerate replay fixtures and generated tests.
Re-run ushman-characterize replay <ws> --strict to confirm the regenerated harness is green.

Schema-version bumps are reserved for structural format changes to the trace or replay fixture documents. Support-version bumps cover semantic/runtime changes where old files still parse but should no longer be trusted without regeneration.
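
The distinction between the two kinds of version can be sketched as a two-step check. Only the schema tag string and the CHARACTERIZE_SUPPORT_VERSION identifier come from this README; the numeric value, the `FixtureHeader` field names, and `checkFixture` are assumptions made for illustration.

```typescript
// Only the tag string and the constant's name appear in this README;
// the value and the header field names below are assumptions.
const REPLAY_FIXTURE_SCHEMA = "ushman.characterize.replay-fixture@1";
const CHARACTERIZE_SUPPORT_VERSION = 1;

interface FixtureHeader {
  schema: string;
  supportVersion: number;
}

type CheckResult = { ok: true } | { ok: false; hint: string };

function checkFixture(header: FixtureHeader): CheckResult {
  // Structural mismatch: the document shape itself is different.
  if (header.schema !== REPLAY_FIXTURE_SCHEMA) {
    return {
      ok: false,
      hint: `Unsupported fixture schema '${header.schema}'; re-run ushman-characterize generate-replay.`,
    };
  }
  // Semantic mismatch: the file parses but can no longer be trusted.
  if (header.supportVersion !== CHARACTERIZE_SUPPORT_VERSION) {
    return {
      ok: false,
      hint: `Fixture support version ${header.supportVersion} is stale; re-run capture and generate-replay.`,
    };
  }
  return { ok: true };
}

const current = checkFixture({ schema: REPLAY_FIXTURE_SCHEMA, supportVersion: 1 });
const stale = checkFixture({ schema: REPLAY_FIXTURE_SCHEMA, supportVersion: 0 });
const foreign = checkFixture({ schema: "something-else@9", supportVersion: 1 });
```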

Why this exists separate from ushman

The decomposition campaign is regular SWE work post-Ushman, enabled by m35 (per IMPLEMENTATION-PLAN.md's m35 row). Cloud LLMs and CI integrations that drive refactors should be able to import the trace harness without cloning the entire reverse-engineering pipeline.

Where this fits in the family

| | |
|---|---|
| Depends on | ushman-threejs-tools (peer, for Vector3 / Quaternion / Object3D brand checks in the trace serializer) |
| Verified by | ushman-verify Tier 0c+0d (the instrumented bundle must Babel-parse and have no undefined references) |
| Runs alongside | ushman parity for the full safety net during a decomposition campaign |
