ushman-vendor-tools
v0.7.0
Byte-manipulation surface for ushman cleanup. Deterministic vendor mapping, N-voter consensus, vendor codemod, vendor-link, AST-aware rename, compose-patches, merge-returns, subdivide-briefs, whole-bundle tree-shake.
ushman-vendor-tools
The byte-manipulation surface for cleaned bundles. Deterministic vendor identifier mapping, N-voter consensus, vendor codemod (extract / rename-only / full-block-extract), vendor-link runtime resolution, AST-aware rename, scoped rename harness, compose-patches, merge-returns, subdivide-briefs, and whole-bundle tree-shake.
This package owns everything that mutates bundle bytes in the ushman pipeline. It does not capture screenshots, run parity, or talk to LLMs. Every byte-mutating operator goes through AST-driven edits — never regex on bundle text (the round-4 base64 corruption lesson). That includes the scoped rename harness used by function-scoped cleanup briefs: it selects one callable body, verifies declaration kind + collision safety, rewrites only the chosen binding family, and leaves on-disk writes to the caller.
Runtime: Bun-first. The package exposes a library API, but the current workspace, CLI, and file-system flows use Bun APIs directly. Treat Bun as required for development, testing, and the shipped CLI.
App-type: entirely app-type-agnostic. Vendor mapping, codemod, AST-aware rename, tree-shake all operate on bundle bytes / module graphs without any graphics-specific assumptions. Future browser-extension and non-graphical-web-app adapters reuse this package unchanged.
Package Layout
- Package-facing implementation lives under top-level src/.
- The old src/core/** shim re-export layer is removed; import the real top-level modules instead.
- src/schema-types.ts is the vendored schema owner for this package surface. The narrow src/schema/ entrypoints re-export those same instances for compatibility.
What this package is NOT
- Not a parity harness. ushman parity owns behavioral equivalence verdicts.
- Not an LLM brief generator. ushman orchestrator owns that.
- Not a workspace state machine. State transitions stay in ushman.
Install
npm i ushman-vendor-tools
bun add ushman-vendor-tools
Quick start
# Build a deterministic vendor mapping for a known package
ushman-vendor map deterministic <ws> --vendor=three --vendor-root=node_modules/three --vendor-version=0.180.0
# Apply a codemod that rewrites identifier references
ushman-vendor codemod apply <ws> --bundle=src/main.ts --mapping=mapping.json --mode=rename-only
# Wire the vendor at runtime via importmap
ushman-vendor link <ws> --target=stage-03-clean --strategy=importmap --vendor=three --subpaths=three,three/webgpu,three/tsl
# Tree-shake dead identifiers post-cleanup (m38)
ushman-vendor tree-shake <ws> --iterate --pre-snapshot-tag=pre-tree-shake-2026-05-07
# Emit AST-bounded cleanup briefs instead of naive byte buckets
ushman-vendor subdivide-briefs <ws> --slice-mode=ast-fn --max-region-bytes=8192
# Pack one dependency-coherent batch with stripped context attachments
ushman-vendor pack-for-agent <ws> --batch-mode=dependency --num-batches=4 --batch=1 --strip-context-deadcode
# Snapshot a recovered workspace instead of a single bundle
workspace-cleanup-progress <ws> --json --append --hotspots=10
Submodules
| Submodule | Owns | Milestone |
|-----------|------|-----------|
| compose-patches | Disjoint byte-range patch composition with invariants | m31 |
| ast-aware-rename | Babel-driven byte-range rename pass (string-literal-safe) | round-4 base64 fix |
| vendor-mapping (deterministic) | Reference-build → bundle identifier matching | m22, m27, m36 |
| vendor-mapping-consensus | N-voter aggregation for LLM-proposed mappings | m23, m37 |
| vendor-codemod | extract / rename-only / full-block-extract modes | m24, m26 |
| vendor-block-detection | banner + bannerless region detection with split-required checks | m22, m36 |
| vendor-link | importmap / relative-paths / vite-rebuild runtime resolution | m25 |
| merge-returns | V4 role-based return merge (candidate/ + cleanup-progress.json) with ledger emission | v4 cutover |
| overlay-merge | Candidate-workspace overlay for decomposition return zips, manifest regeneration, verify + equiv gating | 2026-05-09 decomposition lane |
| subdivide-briefs | Per-region byte invariants, queue pruning | m16, m17, m18 |
| brief batching | Dependency-aware cleanup brief grouping and agent packs | m19, m20 |
| tree-shake | Whole-bundle AST dead-code elimination | m38 |
Operational docs:
Public API
See src/index.ts for the full surface. Top-level highlights:
composePatches(original: Buffer, patches: readonly Patch[]): ComposeResult
applyAstAwareRenames(opts): string
applyScopedRenameMap(source, renameMap, options): { applied, skipped, byteCount, rewritten }
acquireFileLock(filePath, options): Promise<{ release }>
buildDeterministicVendorMapping(opts): Promise<DeterministicMappingResult>
buildVendorMappingConsensus(opts): { mapping, disagreements }
applyVendorCodemod(opts): VendorCodemodResult
detectVendorBlocks(opts): readonly VendorBlockRegion[]
vendorLink(opts): Promise<VendorLinkResult>
mergeReturnedCleanupWork(opts): Promise<MergeResult>
mergeOverlayReturns(opts): Promise<OverlayMergeResult>
buildWorkspaceManifest(workspaceRoot): Promise<HandoffManifestDocument>
subdivideBriefs(opts): Promise<{ emitted, pruned }>
treeShakeBundle(opts): Promise<{ passes, bundlePath }>
computeBindingReachability(opts): { liveNames, deadNames }
packMultiBriefThreads(opts): Promise<{ variants, stageRoot }>
splitMultiBriefResponse(opts): Promise<{ parsed, threadDir }>
analyzeWorkspaceProgress(opts): Promise<WorkspaceCleanupProgressSnapshot>
loadEffectiveAllowlist(opts): Promise<LoadedCleanupProgressAllowlist>
Merge validation hooks now use one callback shape: verifyWorkspace({ workspaceRoot, ignorePaths }). Anchored merge results expose outcome: "accepted" | "applied" | "rejected"; overlay merge results expose outcome: "accepted" | "rejected".
Scoped rename harness
Use applyScopedRenameMap when a cleanup brief must rename one binding family inside a single callable body without touching same-name bindings elsewhere in the file.
const result = applyScopedRenameMap(sourceBuffer, {
  schemaVersion: 'ushman-rename-map/v1',
  filePath: '/workspace/src/main.ts',
  scopeConstraint: {
    kind: 'callable-scope',
    functionPath: 'helper11',
    lineRange: { start: 999, end: 11671 },
  },
  renames: [{ current: 'i', proposed: 'segmentIndex', kind: 'let' }],
})
Behavior notes:
- verifies the declaration kind before rewriting
- rejects same-scope collisions and outer-scope shadowing by default
- keeps string literals and other untouched bytes byte-stable by filtering AST-aware edits instead of regenerating code
- never writes filePath; callers own snapshot/rollback policy
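The declaration-kind, collision, and shadowing checks in the notes above can be pictured with a small standalone sketch. This is not the package's implementation: `checkRenameSafety`, `ScopeBindings`, and the reason strings are hypothetical names, and the binding sets are assumed to come from an AST scope analysis performed elsewhere.

```typescript
// Hypothetical shapes for illustration; the real harness derives these from the AST.
type DeclKind = 'let' | 'const' | 'var' | 'function' | 'param';

interface RenameEntry {
  current: string;
  proposed: string;
  kind: DeclKind;
}

interface ScopeBindings {
  // Names declared inside the selected callable body, with their declaration kind.
  local: Map<string, DeclKind>;
  // Names from enclosing scopes that the selected body references.
  outerReferenced: Set<string>;
}

type Verdict =
  | { ok: true }
  | { ok: false; reason: 'missing-binding' | 'kind-mismatch' | 'same-scope-collision' | 'outer-scope-shadowing' };

function checkRenameSafety(entry: RenameEntry, bindings: ScopeBindings): Verdict {
  const declared = bindings.local.get(entry.current);
  if (declared === undefined) return { ok: false, reason: 'missing-binding' };
  // The declared kind must match what the rename map claims.
  if (declared !== entry.kind) return { ok: false, reason: 'kind-mismatch' };
  // The new name must not already exist in the same scope.
  if (bindings.local.has(entry.proposed)) return { ok: false, reason: 'same-scope-collision' };
  // The new name must not hide an outer binding the body still uses.
  if (bindings.outerReferenced.has(entry.proposed)) return { ok: false, reason: 'outer-scope-shadowing' };
  return { ok: true };
}

const helperBindings: ScopeBindings = {
  local: new Map<string, DeclKind>([['i', 'let'], ['total', 'const']]),
  outerReferenced: new Set<string>(['segmentCount']),
};

const acceptedRename = checkRenameSafety({ current: 'i', proposed: 'segmentIndex', kind: 'let' }, helperBindings);
const collidedRename = checkRenameSafety({ current: 'i', proposed: 'total', kind: 'let' }, helperBindings);
const shadowedRename = checkRenameSafety({ current: 'i', proposed: 'segmentCount', kind: 'let' }, helperBindings);
console.log(acceptedRename, collidedRename, shadowedRename);
```

Running every check before any edit is produced keeps a rejected entry from leaving a half-renamed binding family behind, which fits the harness leaving on-disk writes to the caller.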
CLI
ushman-vendor map deterministic <ws> ...
ushman-vendor map consensus <ws> ...
ushman-vendor codemod apply <ws> ...
ushman-vendor extract <ws> ...
ushman-vendor link <ws> ...
ushman-vendor manifest write <ws>
ushman-vendor merge-returns <ws> ...
ushman-vendor merge-returns <ws> --mode=overlay --returns-dir=returns/ --equiv-baseline=.ushman/equiv-baseline.symbols.json --equiv-fixtures=.ushman/test-harness/fixtures
ushman-vendor merge-returns <ws> --mode=overlay --show-plan --returns-dir=returns/
ushman-vendor merge-returns <ws> --mode=overlay --write-manifest-only
ushman-vendor subdivide-briefs <ws> ...
ushman-vendor generate-cleanup-round <ws> ...
ushman-vendor cleanup-progress <ws> ...
ushman-vendor workspace-cleanup-progress <ws> ...
ushman-vendor list-briefs <ws> ...
ushman-vendor cleanup-briefs <ws> ...
ushman-vendor pack-briefs <ws> ...
ushman-vendor pack-for-agent <ws> ...
ushman-vendor split-multi-response --thread-dir=<dir>
ushman-vendor tree-shake <ws> ...
Adapter-backed stage commands (ushman-vendor extract, ushman-vendor cleanup-briefs) are exposed here for compatibility, but standalone usage still requires orchestrator-side hook injection. In this repository they fail explicitly instead of silently guessing an adapter.
Workspace Cleanup Telemetry
Use cleanup-progress when the workspace still has one dominant bundle file and you want the original top-level declaration ratio for that single bundle. Use workspace-cleanup-progress after decomposition or when the candidate has turned into a real src/ tree and one bundle-level number no longer reflects the cleanup state.
The workspace analyzer walks src/**/*.{ts,tsx,js,jsx}, excludes src/donor/** and node_modules/** by default, reuses the existing per-file analyzeBundleProgress snapshot for each file, then adds workspace totals plus a hotspot list. Hotspots rank named functions, methods, and classes by the number of mangled nested declarations they still contain, so the operator can focus on the scopes that buy back the most readability next.
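As a rough illustration of the hotspot ranking described above, the following sketch sorts scopes by how many mangled nested declarations they still contain. The `ScopeStats` shape, the `rankHotspots` name, the tie-breaking rule, and the sample file names are all assumptions made for the example, not the analyzer's actual snapshot schema.

```typescript
// Hypothetical per-scope record; the real snapshot shape may differ.
interface ScopeStats {
  name: string;
  kind: 'function' | 'method' | 'class';
  filePath: string;
  mangledNestedDeclarations: number;
}

function rankHotspots(scopes: readonly ScopeStats[], limit: number): ScopeStats[] {
  return [...scopes]
    // Scopes with nothing left to clean are not hotspots.
    .filter((entry) => entry.mangledNestedDeclarations > 0)
    // Highest count first; ties fall back to path then name (an assumed rule)
    // so the ordering stays deterministic between runs.
    .sort((a, b) =>
      b.mangledNestedDeclarations - a.mangledNestedDeclarations ||
      a.filePath.localeCompare(b.filePath) ||
      a.name.localeCompare(b.name))
    .slice(0, limit);
}

const sampleScopes: ScopeStats[] = [
  { name: 'renderFrame', kind: 'function', filePath: 'src/render.ts', mangledNestedDeclarations: 14 },
  { name: 'TerrainMesh', kind: 'class', filePath: 'src/terrain.ts', mangledNestedDeclarations: 31 },
  { name: 'formatLabel', kind: 'function', filePath: 'src/ui.ts', mangledNestedDeclarations: 0 },
  { name: 'update', kind: 'method', filePath: 'src/terrain.ts', mangledNestedDeclarations: 14 },
];

const hotspots = rankHotspots(sampleScopes, 2);
console.log(hotspots.map((entry) => `${entry.filePath}#${entry.name}`));
```

The `limit` parameter plays the role that `--hotspots=10` appears to play on the CLI, though that mapping is inferred from the quick-start example rather than confirmed.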
The classifier now accepts an allowlist from two sources:
- Package defaults in src/cleanup-progress-allowlist.defaults.json cover conservative, cross-project short names such as fps, Fx, m11, vec3Node, and similar RE vocabulary that should not count as mangled by default.
- Workspace overrides in .lab/cleanup-progress-allowlist.json let a candidate add project-specific semantic names or support aliases without forking the analyzer.
Precedence is explicit: workspace additions are merged on top of the package defaults, and workspace removeFromDefaults entries delete names from both default allowlist sets. A missing or malformed workspace allowlist never aborts the snapshot; the analyzer warns once and falls back to the package defaults.
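The precedence rule can be sketched as a pure merge function. The two set names (`semanticNames`, `supportAliases`) and the function names are assumptions chosen for illustration; only `removeFromDefaults` and the warn-and-fall-back behavior come from the description above.

```typescript
// Hypothetical allowlist shape: the two set names are assumptions, not the package schema.
interface Allowlist {
  semanticNames: string[];
  supportAliases: string[];
}

interface WorkspaceOverrides extends Allowlist {
  removeFromDefaults: string[];
}

function readList(record: Record<string, unknown>, key: string): string[] {
  const value = record[key];
  if (value === undefined) return [];
  if (Array.isArray(value) && value.every((item) => typeof item === 'string')) return value as string[];
  throw new Error(`${key} must be an array of strings`);
}

function parseOverrides(raw: string): WorkspaceOverrides {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    throw new Error('allowlist must be a JSON object');
  }
  const record = parsed as Record<string, unknown>;
  return {
    semanticNames: readList(record, 'semanticNames'),
    supportAliases: readList(record, 'supportAliases'),
    removeFromDefaults: readList(record, 'removeFromDefaults'),
  };
}

// `raw` is the workspace file content, or undefined when the file is missing.
function effectiveAllowlist(defaults: Allowlist, raw: string | undefined): { allowlist: Allowlist; warning?: string } {
  if (raw === undefined) return { allowlist: defaults, warning: 'workspace allowlist missing; using package defaults' };
  let overrides: WorkspaceOverrides;
  try {
    overrides = parseOverrides(raw);
  } catch {
    // Never abort the snapshot: warn and fall back to the defaults.
    return { allowlist: defaults, warning: 'workspace allowlist malformed; using package defaults' };
  }
  const removed = new Set<string>(overrides.removeFromDefaults);
  // Removals apply to both default sets, then workspace additions are layered on top.
  const merge = (base: string[], extra: string[]): string[] =>
    Array.from(new Set<string>([...base.filter((entry) => !removed.has(entry)), ...extra]));
  return {
    allowlist: {
      semanticNames: merge(defaults.semanticNames, overrides.semanticNames),
      supportAliases: merge(defaults.supportAliases, overrides.supportAliases),
    },
  };
}

const packageDefaults: Allowlist = { semanticNames: ['fps', 'Fx'], supportAliases: ['m11'] };
const merged = effectiveAllowlist(packageDefaults, '{"semanticNames":["hud"],"removeFromDefaults":["Fx","m11"]}');
const fallback = effectiveAllowlist(packageDefaults, '{not json');
console.log(merged, fallback);
```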
The same effective allowlist now drives cleanup-round generation and single-thread mega-brief symbol cards, so telemetry and brief generation stop disagreeing about names like helper11.
Cleanup-stage placeholder names emitted by ushman's decompose/source-map recovery path, such as helper11, value47, config1, and _Component33, are classified as mangled with reason cleanup-placeholder. These names are temporary pipeline residue, not true semantic recoveries, so they count as work remaining even though they are longer than raw minifier output. They still appear in sampleMangled, and samplePlaceholders exposes a focused subset so operators can distinguish raw minifier residue from cleanup-pipeline placeholders without losing the full mangled sample.
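A toy classifier makes the placeholder rule concrete. The regular expression and the length threshold below are assumed heuristics that happen to match the names quoted above; the real analyzer's rules are not shown in this README and may be stricter. Note that this regex runs on a single identifier string, not on bundle text.

```typescript
type Classification =
  | { mangled: false }
  | { mangled: true; reason: 'cleanup-placeholder' | 'short-name' };

// Assumed heuristic: optional underscore, a word of three or more letters, then a numeric suffix.
const PLACEHOLDER_PATTERN = /^_?[A-Za-z]{3,}\d+$/;
// Assumed threshold for raw minifier output; the real analyzer may use a different rule.
const MAX_MINIFIED_LENGTH = 2;

function classifyName(identifier: string, allowed: ReadonlySet<string>): Classification {
  // Allowlisted names are never counted as mangled.
  if (allowed.has(identifier)) return { mangled: false };
  if (PLACEHOLDER_PATTERN.test(identifier)) return { mangled: true, reason: 'cleanup-placeholder' };
  if (identifier.length <= MAX_MINIFIED_LENGTH) return { mangled: true, reason: 'short-name' };
  return { mangled: false };
}

const allowedNames = new Set<string>(['fps', 'Fx', 'm11', 'vec3Node']);
const identifiers = ['helper11', 'value47', 'config1', '_Component33', 'iY', 'Fx', 'segmentIndex'];
const classified = identifiers.map((identifier) => ({ identifier, result: classifyName(identifier, allowedNames) }));

// Placeholders stay in the full mangled sample and also get their own focused subset.
const sampleMangled = classified.filter((c) => c.result.mangled).map((c) => c.identifier);
const samplePlaceholders = classified
  .filter((c) => c.result.mangled && c.result.reason === 'cleanup-placeholder')
  .map((c) => c.identifier);
console.log({ sampleMangled, samplePlaceholders });
```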
When you compare new snapshots against history captured before this placeholder fix, expect a one-time step increase in mangled counts. That is a baseline correction, not a fresh cleanup regression; renderSnapshotReport, renderWorkspaceSnapshotReport, cleanup-progress, and workspace-cleanup-progress now flag that legacy-placeholder baseline explicitly so operators can re-baseline dashboards and use samplePlaceholders to audit how much of the jump came from placeholders versus raw short names.
JSON CLI output is wrapped in { schemaVersion, snapshot, comparison } with schema version ushman-workspace-cleanup-progress/v1. When --append is used, history lives in .lab/cleanup-progress-workspace-history.json with the same { snapshots: [...] } envelope style as the single-bundle history, but with WorkspaceCleanupProgressSnapshot payloads that already carry the per-file breakdowns the cockpit needs. If that history file is malformed, the CLI refuses to append, backs the corrupt file up to *.corrupt-<timestamp>.bak, and exits non-zero instead of overwriting operator history.
Testing
bun test
bun test src/overlay-merge.test.ts
bun test src/vendor-link.e2e.test.ts
bun run lint
bun run typecheck
Merge Returns v4
mergeReturnedCleanupWork now treats returned zips as role-based v4 handoffs:
- HANDOFF.json must declare schemaVersion: "ushman-handoff/v4.0".
- only candidate/** plus root cleanup-progress.json are mutable; context/, asl/, and .lab/ are rejected.
- returned candidate/... paths are rebased onto the workspace root at apply time.
- every cleanup-progress.json:applied[] row emits an agent-patch ledger entry under .lab/ledger/cleanup/.
The accepted return surface is therefore the root candidate tree (src/, public/, index.html, package.json, tsconfig.json, vite.config.ts) rather than the legacy stages/03-clean/ mirror.
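The path rules above amount to a small filter-and-rebase step. The sketch below is one way to express it, assuming zip entries arrive as relative POSIX-style paths and that HANDOFF.json is read separately as metadata; `rebaseReturnedPath`, the reason strings, and the `..` traversal check are hypothetical additions for the example.

```typescript
type PathVerdict =
  | { accepted: true; workspacePath: string }
  | { accepted: false; reason: string };

// Roots named above as rejected. Anything else outside candidate/ is also refused in this sketch.
const REJECTED_PREFIXES = ['context/', 'asl/', '.lab/'];
const CANDIDATE_PREFIX = 'candidate/';

function rebaseReturnedPath(zipPath: string): PathVerdict {
  const normalized = zipPath.replace(/\\/g, '/');
  // Extra safety assumption: refuse any parent-directory segment.
  if (normalized.split('/').some((segment) => segment === '..')) {
    return { accepted: false, reason: 'path-traversal' };
  }
  // The root progress file is the only mutable file outside candidate/.
  if (normalized === 'cleanup-progress.json') {
    return { accepted: true, workspacePath: normalized };
  }
  const rejected = REJECTED_PREFIXES.find((prefix) => normalized.startsWith(prefix));
  if (rejected !== undefined) {
    return { accepted: false, reason: `immutable-root:${rejected}` };
  }
  if (normalized.startsWith(CANDIDATE_PREFIX) && normalized.length > CANDIDATE_PREFIX.length) {
    // candidate/src/main.ts in the zip lands at src/main.ts in the workspace.
    return { accepted: true, workspacePath: normalized.slice(CANDIDATE_PREFIX.length) };
  }
  return { accepted: false, reason: 'outside-candidate' };
}

const returnedEntries = ['candidate/src/main.ts', 'cleanup-progress.json', '.lab/ledger/cleanup/x.json', 'stages/03-clean/main.ts'];
const verdicts = returnedEntries.map((entry) => ({ entry, verdict: rebaseReturnedPath(entry) }));
console.log(verdicts);
```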
Overlay Merge Mode
merge-returns --mode=overlay is the decomposition lane for return zips that replace a stage entrypoint plus add a src/ module tree, instead of sending anchored edits against a single monolith.
Overview:
- Auto-generates .ushman/handoff-manifest.json when the workspace does not have one.
- Rebuilds the candidate manifest before ushman-verify runs, so Tier 0a sees fresh hashes.
- Reuses the current workspace manifest for changed-path candidate rewrites, so unchanged files are not re-hashed after every overlay apply.
- Validates the module entrypoint referenced by the workspace index.html.
- Synthesizes missing cleanup-progress.json ↔ cleanup-briefs.json rows for completed overlay work.
- Gates the candidate on ushman-verify with the equiv baseline path excluded from discovery, then ushman-equiv tiers I/S/R.
Auto-normalization:
- Enabled by default.
- Strips operator-side trees such as asl/, screenshots/, tools/, parity/, and node_modules/.
- Strips stale .ushman/ metadata, .DS_Store, and *.stub.json.
- Ignores unsupported *.deleted sentinels.
- Restores cleanup-briefs.json and cleanup-progress.json from the live workspace unless --no-auto-normalize is passed.
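The normalization rules above can be read as a per-path decision table. The following sketch encodes them as a pure function; `planNormalization`, the `PlannedAction` shape, and the choice to match operator trees only at the overlay root are assumptions, and the real pass may classify paths differently.

```typescript
// Hypothetical action record for one overlay path.
type PlannedAction =
  | { path: string; action: 'keep' }
  | { path: string; action: 'strip'; why: string }
  | { path: string; action: 'ignore'; why: string }
  | { path: string; action: 'restore-from-workspace' };

const OPERATOR_TREES = ['asl/', 'screenshots/', 'tools/', 'parity/', 'node_modules/'];
const RESTORED_FILES = ['cleanup-briefs.json', 'cleanup-progress.json'];

// `path` is assumed to be relative to the overlay root, with forward slashes.
function planNormalization(path: string): PlannedAction {
  const baseName = path.slice(path.lastIndexOf('/') + 1);
  if (OPERATOR_TREES.some((tree) => path.startsWith(tree))) {
    return { path, action: 'strip', why: 'operator-side tree' };
  }
  if (path.startsWith('.ushman/')) return { path, action: 'strip', why: 'stale .ushman metadata' };
  if (baseName === '.DS_Store') return { path, action: 'strip', why: 'OS metadata' };
  if (baseName.endsWith('.stub.json')) return { path, action: 'strip', why: 'stub file' };
  // Deletion sentinels are unsupported, so they are skipped rather than applied.
  if (baseName.endsWith('.deleted')) return { path, action: 'ignore', why: 'unsupported deletion sentinel' };
  // Bookkeeping files come from the live workspace, not from the return zip.
  if (RESTORED_FILES.indexOf(path) !== -1) return { path, action: 'restore-from-workspace' };
  return { path, action: 'keep' };
}

const overlayPaths = ['src/main.ts', 'node_modules/three/index.js', 'src/.DS_Store', 'src/old.ts.deleted', 'cleanup-briefs.json'];
const plannedActions = overlayPaths.map(planNormalization);
console.log(plannedActions);
```

Returning a list of actions instead of mutating the overlay directly keeps the decisions inspectable, in the same spirit as a result that reports what was dropped or restored.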
Flags and outputs:
- --reject-on-forbidden converts forbidden overlay paths into a hard rejection even when auto-normalization is enabled.
- --show-plan prints the added, replaced, and removed file plan without applying anything.
- --write-manifest-only and ushman-vendor manifest write <ws> regenerate .ushman/handoff-manifest.json without running a merge.
- --equiv-baseline and --equiv-fixtures may point inside the workspace or at absolute external paths; external inputs are staged into a temporary workspace-local path for validation.
- Equiv Tier L exists at the platform level but overlay merge currently skips it intentionally; the current validation lane only runs tiers I/S/R.
- OverlayMergeResult exposes plan plus normalizationActions so callers can inspect what was dropped or restored.
Rejection and acceptance:
- Expected rejection reasons are manifest-stale, index-html-broken, verify-red, and equiv-red.
- Unexpected infrastructure failures surface as internal-error.
- Green or yellow verify/equiv reports are accepted.
- --apply also creates a pre-merge git tag, stages disk-backed rollback backups under .ushman/.overlay-merge-backups/, records resumable rollback journals under .ushman/.overlay-merge-journals/, writes verify/equiv result files, and emits an overlay merge report under .ushman/merge-reports/.
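The acceptance rules can be summarized as a short gate chain. In the sketch below, the `GateInputs` shape, the `decideOverlayOutcome` name, and the order in which gates are checked are assumptions; the reason strings and the green/yellow/red semantics are taken from the list above.

```typescript
type ReportColor = 'green' | 'yellow' | 'red';
type RejectionReason = 'manifest-stale' | 'index-html-broken' | 'verify-red' | 'equiv-red' | 'internal-error';
type OverlayOutcome =
  | { outcome: 'accepted' }
  | { outcome: 'rejected'; reason: RejectionReason };

// Hypothetical inputs: the real merge computes these from the workspace.
interface GateInputs {
  manifestFresh: boolean;
  indexHtmlEntrypointValid: boolean;
  runVerify: () => ReportColor;
  runEquiv: () => ReportColor;
}

function decideOverlayOutcome(gates: GateInputs): OverlayOutcome {
  try {
    if (!gates.manifestFresh) return { outcome: 'rejected', reason: 'manifest-stale' };
    if (!gates.indexHtmlEntrypointValid) return { outcome: 'rejected', reason: 'index-html-broken' };
    // Only a red report rejects; yellow is accepted alongside green.
    if (gates.runVerify() === 'red') return { outcome: 'rejected', reason: 'verify-red' };
    if (gates.runEquiv() === 'red') return { outcome: 'rejected', reason: 'equiv-red' };
    return { outcome: 'accepted' };
  } catch {
    // Anything unexpected is reported as infrastructure failure, not as a verdict.
    return { outcome: 'rejected', reason: 'internal-error' };
  }
}

const yellowRun = decideOverlayOutcome({
  manifestFresh: true,
  indexHtmlEntrypointValid: true,
  runVerify: () => 'yellow',
  runEquiv: () => 'green',
});
console.log(yellowRun);
```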
Use ushman-vendor merge-returns --help for the full anchored vs overlay flag matrix.
The "no regex on bundle text" rule
Round-4 base64 corruption fix:
merge-returns's post-merge regex-based rename propagation (\b<from>\b) struck inside heightsB64 and colorsB64 string literals because iY flanked by / and + in the base64 payload satisfies \b. Fixed by AST-driven byte-range edit pass.
Every byte-mutating operator in this package uses applyAstAwareRenames or its peers under compose-patches. Adding a regex-based mutator is a hard no.
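The failure mode is easy to reproduce in isolation. The sketch below contrasts a word-boundary regex rename with edits applied at explicit, disjoint offset ranges. It is a simplified stand-in: `regexRename`, `applyRangeEdits`, and `RangeEdit` are hypothetical names, offsets are string indices instead of byte offsets, and the identifier positions are located by hand where a real pass would take them from AST identifier nodes.

```typescript
// A fragment with a base64-like literal that contains "iY" between "/" and "+",
// plus a real binding named iY.
const bundleText = 'const heightsB64 = "AAA/iY+BBB"; let iY = 0; iY += 1;';

// Unsafe: \b matches between "/" and "i" and between "Y" and "+",
// so the rename also rewrites the string literal.
function regexRename(text: string, from: string, to: string): string {
  return text.replace(new RegExp(`\\b${from}\\b`, 'g'), to);
}

interface RangeEdit {
  start: number; // inclusive offset
  end: number;   // exclusive offset
  replacement: string;
}

// Safe shape: replace only the listed ranges and refuse overlapping or out-of-bounds edits.
function applyRangeEdits(text: string, edits: readonly RangeEdit[]): string {
  const ordered = [...edits].sort((a, b) => a.start - b.start);
  let cursor = 0;
  let output = '';
  for (const edit of ordered) {
    if (edit.start < cursor || edit.end < edit.start || edit.end > text.length) {
      throw new Error(`overlapping or out-of-bounds edit at ${edit.start}`);
    }
    // Copy untouched text verbatim, then splice in the replacement.
    output += text.slice(cursor, edit.start) + edit.replacement;
    cursor = edit.end;
  }
  return output + text.slice(cursor);
}

// Stand-in for AST output: the two identifier occurrences, located by hand.
const declarationStart = bundleText.indexOf('let iY') + 'let '.length;
const referenceStart = bundleText.indexOf('; iY') + '; '.length;
const identifierEdits: RangeEdit[] = [declarationStart, referenceStart].map((start) => ({
  start,
  end: start + 'iY'.length,
  replacement: 'rowIndex',
}));

const corrupted = regexRename(bundleText, 'iY', 'rowIndex');
const renamed = applyRangeEdits(bundleText, identifierEdits);
console.log(corrupted); // the base64 payload now contains "rowIndex"
console.log(renamed);   // the string literal is unchanged
```

Because every byte outside the listed ranges is copied through untouched, a range-based pass cannot reach into a string literal unless a range explicitly points there.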
Where this fits in the family
| Relationship | Packages |
|---|---|
| Verified by | ushman-verify Tier 0c+0d+0e (pre/post each mutation) |
| Consumed by | ushman orchestrator (cleanup lane: merge-returns, subdivide-briefs, generate-cleanup-round) |
| Independent of | @ushman/characterize, @ushman/threejs-tools, @ushman/spector |
