ushman-recover

v0.4.0

Published

18 hours ago

Mechanical recovery audits, replay-surface drafts, and source-map-driven recovery helpers for manual migrations.

ragaeeb

ushman recover sourcemaps replay manual-migration recovery-audit

ushman-recover

Mechanical recovery help for Ushman manual migrations.

ushman-recover consolidates the one-off local recovery scripts into one standalone package. In v4 it reads a shibuk-style workspace, scans asl/donor for source-map recoverability, and writes an advisory recovery bundle to .lab/recover/.

The output is always non-authoritative. This package is a migration aid and completeness report, not recovered-source provenance and not a parity verdict. Recovered files stay under .lab/recover/recovered/; operators review and promote them into src/ manually.

What this package is

A source-map and inline-surface recovery helper.
A draft replay-surface reader for handoff.json, graph/aliases.json, graph/content-types.json, and storage/boot.json.
An offline report generator for unresolved local imports, asset mirror planning, missing asset references, package inventory, and JS→TS risk markers.
A v4 workspace tool that emits advisory evidence plus ledger entries under .lab/.
A structural-only workspace-state report builder/loader for upstream orchestrators that need a current pipeline snapshot.

What this package is NOT

Not an AST-heavy deobfuscation engine.
Not a whole-program chunk-reconstruction system.
Not an aggressive JS→TS rewrite tool.
Not a backend recreation layer.
Not a browser parity harness. Behavioral verification stays in ushman.

Install

bun add ushman-recover

Quick start

ushman-recover /tmp/3d-chess-v4
ushman-recover /tmp/3d-chess-v4 --json

Default output bundle:

<workspace>/.lab/recover/
├── sourcemap-report.json
├── recovery-plan.json
├── inline-extraction-report.json
├── replay-surface.json             # when replay metadata exists
├── manifest.json
└── recovered/
    └── src/...

Legacy subcommands (audit, sources, inline) were removed from the CLI in the v4 cutover. v3 donor roots are refused; the CLI expects a v4 workspace root with asl/donor/.

Commands

`ushman-recover <workspace-root>`

Reads <workspace-root>/asl/donor/ and writes advisory recovery artifacts to <workspace-root>/.lab/recover/.

The default run writes:

sourcemap-report.json for discovered maps and completeness
recovery-plan.json for recoverable authored files and migration risks
inline-extraction-report.json for inline-script inspection or yield status
manifest.json for the output inventory
recovered/** for advisory recovered files

Options:

--dry-run computes the recovery result without writing .lab/recover/ or ledger entries; writtenFiles still reports the advisory file paths that would have been written
--json prints the result object instead of a text summary
--ts-nocheck prepends // @ts-nocheck to recovered runtime modules
--help prints usage and --version prints the package version

Write-mode recovery acquires <workspace>/.lab/recover/.lock before it mutates recovery output or emits ledger entries. If another process already holds that lock, the command fails fast with an operator-visible error. Dry runs stay read-only and do not acquire the lock.
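
The fail-fast lock described above can be sketched with an exclusive file create. This is a minimal illustration, not the package's implementation: `acquireLock` is a hypothetical helper, the error wording is made up, and the demo uses a throwaway temp directory in place of `<workspace>/.lab/recover/`.

```typescript
import { closeSync, mkdtempSync, openSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper, not part of the ushman-recover API.
function acquireLock(lockPath: string): () => void {
    try {
        // "wx" creates the file and fails with EEXIST if it already exists,
        // so creation doubles as an atomic "is someone else holding this?" check.
        closeSync(openSync(lockPath, "wx"));
    } catch (error) {
        if ((error as { code?: string }).code === "EEXIST") {
            // Illustrative wording only; the real CLI message may differ.
            throw new Error(`Recovery lock is already held: ${lockPath}`);
        }
        throw error;
    }
    // The returned function releases the lock by removing the file.
    return () => rmSync(lockPath, { force: true });
}

// Demo in a throwaway directory standing in for <workspace>/.lab/recover/.
const demoDir = mkdtempSync(join(tmpdir(), "recover-lock-"));
const lockPath = join(demoDir, ".lock");
const release = acquireLock(lockPath);

let secondAttemptFailed = false;
try {
    acquireLock(lockPath);
} catch {
    secondAttemptFailed = true; // fails immediately instead of waiting
}
release();
```

A dry run would skip `acquireLock` entirely, since it never mutates the output directory.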

Inline donors

If .lab/capture/donor-classification.json marks the donor as inline-script, recover yields to shibuk's inline splitter instead of trying to outdo it. It emits advisory reports, records the yield in inline-extraction-report.json, and writes no inline-extracted files to recovered/.

If the same donor also contains recoverable sidecar source maps, those authored files are still recovered into .lab/recover/recovered/. The inline yield only suppresses inline extraction, not source-map recovery.
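
The two rules above reduce to a small decision table. The sketch below is an assumption-laden illustration: `decideRecovery`, the `RecoveryDecision` shape, and the `"bundled"` classification value are hypothetical names; only `inline-script` comes from this README.

```typescript
// "inline-script" is the classification named in this README;
// "bundled" is a hypothetical placeholder for any other classification.
type DonorClassification = "inline-script" | "bundled";

interface RecoveryDecision {
    extractInline: boolean;         // attempt inline extraction?
    recoverFromSourceMaps: boolean; // recover authored files from sidecar maps?
}

// Hypothetical helper: the inline yield suppresses inline extraction only.
function decideRecovery(
    classification: DonorClassification,
    hasSidecarSourceMaps: boolean,
): RecoveryDecision {
    return {
        extractInline: classification !== "inline-script",
        recoverFromSourceMaps: hasSidecarSourceMaps,
    };
}

const inlineWithMaps = decideRecovery("inline-script", true);
const inlineWithoutMaps = decideRecovery("inline-script", false);
```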

Reports

The recovery plan/reporting surface includes:

recovery mode classification
source-map coverage summary and per-map inventory
authored files recoverable from sourcesContent
source conflicts when multiple source maps disagree on a file
missing local imports
asset mirror planning for donor assets that exist on disk
missing asset references
package inventory with inferred versions when evidence exists
JS→TS risk markers
inline split suggestions or yield status
replay-surface metadata when replay-package files exist
ledger entries for the tool invocation and each recovered advisory file
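
As an illustration of the "source conflicts" item, the sketch below flags a source path whose sourcesContent differs between maps. It relies only on the parallel `sources` / `sourcesContent` arrays of the Source Map v3 format; `MapSources`, `SourceConflict`, and `findSourceConflicts` are hypothetical names, and the package's real conflict report may carry different fields.

```typescript
// Subset of a parsed source map plus the path it was read from (hypothetical shape).
interface MapSources {
    mapPath: string;
    sources: string[];
    sourcesContent: (string | null)[];
}

interface SourceConflict {
    source: string;
    mapPaths: string[];
}

function findSourceConflicts(maps: MapSources[]): SourceConflict[] {
    // source path -> distinct contents seen, and the maps that mention it
    const seen = new Map<string, { contents: string[]; mapPaths: string[] }>();
    for (const map of maps) {
        map.sources.forEach((source, index) => {
            const content = map.sourcesContent[index];
            if (content == null) return; // no embedded content, nothing to compare
            const entry = seen.get(source) ?? { contents: [], mapPaths: [] };
            if (entry.contents.indexOf(content) === -1) entry.contents.push(content);
            entry.mapPaths.push(map.mapPath);
            seen.set(source, entry);
        });
    }
    const conflicts: SourceConflict[] = [];
    seen.forEach((entry, source) => {
        // More than one distinct content for the same path is a conflict.
        if (entry.contents.length > 1) {
            conflicts.push({ source, mapPaths: entry.mapPaths.slice().sort() });
        }
    });
    return conflicts;
}

const conflicts = findSourceConflicts([
    { mapPath: "a.js.map", sources: ["src/a.ts", "src/b.ts"], sourcesContent: ["export const a = 1;", "export const b = 1;"] },
    { mapPath: "b.js.map", sources: ["src/a.ts", "src/b.ts"], sourcesContent: ["export const a = 2;", "export const b = 1;"] },
]);
```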

Workspace-state report

buildWorkspaceStateReport() and loadWorkspaceStateReport() cover the ongoing structural snapshot that sits beside the intake-time recovery reports. The report is intentionally pipeline-owned and structural only:

operator-authored prose stays in the ledger as note/change-log entries
narrativePointers only points at that ledger narrative with entry ids and counts
there is no conflict-detection or merge-from-existing behavior because operators do not hand-edit this file

The loader looks for workspace-state-report.json at the workspace root. The current CLI does not write it yet; upstream orchestration owns when that snapshot is emitted.

When to use it

Use this report when an upstream orchestrator needs a compact "where is this workspace now?" snapshot between stage transitions. This is not the intake-time recovery bundle and it is not a hand-edited operator log:

runWorkspaceRecovery() and the intake-time reports answer what the donor scan recovered right now
buildWorkspaceStateReport() answers what the broader migration pipeline currently believes about the workspace
ledger note and change-log entries still hold the narrative half such as cleanup waves, verified flows, and open issues

The intended integration point is an orchestrator that:

computes the structural snapshot fields from the current workspace state
resolves the latest ledger entry ids and open-issue counts
writes the report as a pipeline-owned snapshot

Example with multiple populated sections:

const workspaceState = buildWorkspaceStateReport({
    workspace: {
        generatedAt: new Date().toISOString(),
        name: '3d-chess-v4',
        pipelineVersion: '0.3.0',
        root: '/tmp/3d-chess-v4',
    },
    donor: {
        entryBundle: 'public/assets/app.js',
        originUrl: 'https://example.test',
        sourceMaps: {
            count: 4,
            present: true,
        },
    },
    candidate: {
        status: 'in-cleanup',
        lastCleanupRound: 3,
        strategy: 'vendor-extract-then-cleanup',
    },
    dependencies: {
        confirmed: [{ name: 'react', version: '19.2.0' }],
        extractedShims: [{ name: 'buffer', path: 'src/shims/buffer.ts' }],
        stillEmbedded: [{ guess: 'three', reason: 'vendor chunk strings' }],
    },
    protectedSurfaces: {
        assetUrlCount: 12,
        routeStrings: ['/play', '/settings'],
        shaderUniforms: ['uResolution', 'uTime'],
        storageKeys: ['session', 'token'],
        websocketEvents: ['join-room', 'sync-state'],
    },
    narrativePointers: {
        latestCleanupWaveEntryId: 'note-20260523-000123',
        latestVerifiedFlowEntryId: 'note-20260523-000124',
        lastChangeLogEntryId: 'change-20260523-000125',
        openIssueCount: 2,
    },
});

narrativePointers is the bridge back to the ledger. The state report should point at narrative entries, not duplicate their prose.

Workflow

flowchart TD
    A["workspace root"] --> B["asl/donor scan"]
    B --> C["source map discovery"]
    B --> D["inline inspection"]
    B --> E["replay metadata read"]
    C --> F["recovery plan"]
    D --> G["yield / inline report"]
    E --> H["replay-surface draft"]
    F --> I["recovered advisory files"]
    F --> J["sourcemap-report.json"]
    F --> K["recovery-plan.json"]
    G --> L["inline-extraction-report.json"]
    H --> M["replay-surface.json"]
    I --> N["ledger entries"]
    J --> O["manifest.json"]
    K --> O
    L --> O
    M --> O
    N --> O

Public API

Top-level entry points from src/index.ts:

import {
    buildSourceRecoveryPlan,
    buildWorkspaceStateReport,
    inspectInlineScripts,
    loadWorkspaceStateReport,
    runRecoveryAudit,
    readReplaySurfaceDraft,
    runWorkspaceRecovery,
} from 'ushman-recover';

const recovery = await runWorkspaceRecovery({ workspaceRoot: '/tmp/3d-chess-v4' });
const audit = await runRecoveryAudit({ inputRoot: '/tmp/3d-chess-v4/asl/donor' });
const plan = await buildSourceRecoveryPlan('/tmp/3d-chess-v4/asl/donor');
const inline = await inspectInlineScripts('/tmp/3d-chess-v4/asl/donor');
const replaySurface = await readReplaySurfaceDraft('/tmp/3d-chess-v4/asl/donor');
const workspaceState = buildWorkspaceStateReport({
    workspace: {
        generatedAt: new Date().toISOString(),
        name: '3d-chess-v4',
        pipelineVersion: '0.3.0',
        root: '/tmp/3d-chess-v4',
    },
});
const persistedWorkspaceState = await loadWorkspaceStateReport('/tmp/3d-chess-v4');

runWorkspaceRecovery() is the supported CLI-aligned v4 workflow. runRecoveryAudit() remains public as a low-level donor-root helper for library callers that want a non-writing audit/report surface.

Shared context

runRecoveryAudit() and runWorkspaceRecovery() both build the same donor-root RecoveryContext internally. That shared context performs one file-inventory walk and fans the result out to source recovery, inline inspection, and replay-surface reads so the audit and recovery paths stay aligned without duplicating donor scans.
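
The "walk once, fan out" idea can be sketched as follows. This is a simplified, synchronous stand-in and not the real RecoveryContext: `DonorContextSketch`, `buildContext`, and the three filter helpers are hypothetical, and the real scan is presumably asynchronous. The replay file names are the ones listed earlier in this README.

```typescript
// Hypothetical, simplified stand-in for the shared donor-root context.
interface DonorContextSketch {
    readonly inventory: readonly string[];
}

// Replay metadata files named in this README.
const REPLAY_FILES = ["handoff.json", "graph/aliases.json", "graph/content-types.json", "storage/boot.json"];

// The walk runs exactly once; every consumer reads the cached inventory.
function buildContext(walk: () => string[]): DonorContextSketch {
    return { inventory: walk() };
}

const sourceMapCandidates = (ctx: DonorContextSketch) => ctx.inventory.filter((p) => p.endsWith(".map"));
const inlineCandidates = (ctx: DonorContextSketch) => ctx.inventory.filter((p) => p.endsWith(".html"));
const replayCandidates = (ctx: DonorContextSketch) => ctx.inventory.filter((p) => REPLAY_FILES.indexOf(p) !== -1);

let walkCount = 0;
const context = buildContext(() => {
    walkCount += 1;
    return ["index.html", "assets/app.js", "assets/app.js.map", "handoff.json"];
});

const fanOut = {
    sourceMaps: sourceMapCandidates(context),
    inline: inlineCandidates(context),
    replay: replayCandidates(context),
};
```

Because both entry points would start from the same context object, they see the same inventory and cannot drift apart on which files exist.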

Replay-surface draft

readReplaySurfaceDraft() is still available as a low-level helper. The authoritative schema still lives in the parent ushman orchestrator. In v4 the default CLI writes replay-surface.json into .lab/recover/ when donor replay metadata exists, and the API also returns the parsed draft.

Ledger schema

Each successful non-dry-run recovery writes capture-phase ledger entries under <workspace>/.lab/ledger/capture/.

Current emitted shapes:

tool-invocation
- summary: "run workspace recovery"
- runId
- details.recoverableFileCount
- details.writtenFileCount
- links.outputDir
- links.sourceMapReportPath
- links.recoveryPlanPath
note
- subkind: "automation"
- summary: "Recovered advisory file <relativePath>"
- runId
- links.donorPath
- links.recoveredPath

The ledger remains append-only across runs. Entries are grouped by runId; a dry run emits no ledger entries.
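
For consumers that read these entries, the shapes above could be modeled roughly as below. This is a sketch under assumptions: the `kind` discriminator field name is a guess (the README only shows the labels `tool-invocation` and `note`), entries may carry fields not listed here, and `groupByRunId` plus all sample values are hypothetical.

```typescript
interface ToolInvocationEntry {
    kind: "tool-invocation"; // field name assumed
    summary: string;
    runId: string;
    details: { recoverableFileCount: number; writtenFileCount: number };
    links: { outputDir: string; sourceMapReportPath: string; recoveryPlanPath: string };
}

interface AutomationNoteEntry {
    kind: "note"; // field name assumed
    subkind: "automation";
    summary: string;
    runId: string;
    links: { donorPath: string; recoveredPath: string };
}

type CaptureLedgerEntry = ToolInvocationEntry | AutomationNoteEntry;

// Entries are grouped by runId, so one run's invocation and notes stay together.
function groupByRunId(entries: CaptureLedgerEntry[]): Map<string, CaptureLedgerEntry[]> {
    const groups = new Map<string, CaptureLedgerEntry[]>();
    for (const entry of entries) {
        const group = groups.get(entry.runId) ?? [];
        group.push(entry);
        groups.set(entry.runId, group);
    }
    return groups;
}

// Sample data: ids and paths are made up for illustration.
const sampleEntries: CaptureLedgerEntry[] = [
    {
        kind: "tool-invocation",
        summary: "run workspace recovery",
        runId: "run-a",
        details: { recoverableFileCount: 1, writtenFileCount: 1 },
        links: {
            outputDir: ".lab/recover",
            sourceMapReportPath: ".lab/recover/sourcemap-report.json",
            recoveryPlanPath: ".lab/recover/recovery-plan.json",
        },
    },
    {
        kind: "note",
        subkind: "automation",
        summary: "Recovered advisory file src/app.ts",
        runId: "run-a",
        links: { donorPath: "asl/donor/assets/app.js.map", recoveredPath: ".lab/recover/recovered/src/app.ts" },
    },
    {
        kind: "tool-invocation",
        summary: "run workspace recovery",
        runId: "run-b",
        details: { recoverableFileCount: 0, writtenFileCount: 0 },
        links: {
            outputDir: ".lab/recover",
            sourceMapReportPath: ".lab/recover/sourcemap-report.json",
            recoveryPlanPath: ".lab/recover/recovery-plan.json",
        },
    },
];

const byRun = groupByRunId(sampleEntries);
```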

The capture directory also keeps two pipeline-owned internals:

.tail-state
- committed tail metadata: latest ledger entry id plus the next sequence number
.tail-pending
- crash-recovery marker for an entry that was written before the tail metadata advanced

Recovery path when metadata and on-disk entries disagree:

if .tail-pending points at an entry file that exists, recover commits that pending tail and clears the marker
if .tail-pending points at a missing entry file, recover drops the stale marker and keeps the previous committed tail
if .tail-state is missing, malformed, or points at a pruned entry, recover rescans the ledger directory once, rebuilds the tail metadata, and then returns to the steady-state metadata path
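
The three rules can be read as a small planning function. The sketch below is one interpretation, not the package's code: `planTailRecovery`, `TailStep`, and `TailInputs` are hypothetical, and re-checking the committed tail after dropping a stale marker is an assumption about how the rules compose.

```typescript
type TailStep = "commit-pending" | "drop-pending" | "rescan" | "use-tail-state";

interface TailInputs {
    pendingEntryId: string | null;   // entry id recorded in .tail-pending, if any
    tailStateEntryId: string | null; // latest id from .tail-state; null if missing or malformed
    entryExists: (entryId: string) => boolean; // does the entry file exist on disk?
}

function planTailRecovery(inputs: TailInputs): TailStep[] {
    const steps: TailStep[] = [];
    if (inputs.pendingEntryId !== null) {
        if (inputs.entryExists(inputs.pendingEntryId)) {
            // The entry was written before the crash: commit it and clear the marker.
            return ["commit-pending"];
        }
        // Stale marker: drop it and fall back to the previous committed tail.
        steps.push("drop-pending");
    }
    if (inputs.tailStateEntryId === null || !inputs.entryExists(inputs.tailStateEntryId)) {
        // Missing, malformed, or pointing at a pruned entry: rebuild from one directory scan.
        steps.push("rescan");
    } else {
        steps.push("use-tail-state");
    }
    return steps;
}

const onDisk = ["e1", "e2", "e3"];
const exists = (id: string) => onDisk.indexOf(id) !== -1;

const crashedAfterWrite = planTailRecovery({ pendingEntryId: "e3", tailStateEntryId: "e2", entryExists: exists });
const staleMarker = planTailRecovery({ pendingEntryId: "e9", tailStateEntryId: "e2", entryExists: exists });
const lostTailState = planTailRecovery({ pendingEntryId: null, tailStateEntryId: null, entryExists: exists });
```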

Error catalog

Common user-facing failures:

Workspace root is not a directory
- The CLI was pointed at a file path instead of a workspace directory.
Expected a v4 workspace root with ...
- The workspace is missing asl/donor/ or .lab/lab.json.
Invalid donor classification at .lab/capture/donor-classification.json
- Shibuk metadata is malformed or from an unexpected schema version.
Refusing to write outside ...
- A source-map path normalized outside the recovery output root.
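
The last failure above is a path-containment check. A minimal sketch of such a check is shown here, assuming POSIX-style separators; `resolveInsideRoot` is a hypothetical helper, and its error wording is illustrative because the README elides the real message.

```typescript
// Hypothetical helper: normalizes a source-map-derived path and refuses
// anything that would climb above the recovery output root.
function resolveInsideRoot(root: string, candidate: string): string {
    const segments: string[] = [];
    for (const part of candidate.replace(/\\/g, "/").split("/")) {
        if (part === "" || part === ".") continue; // also strips a leading "/"
        if (part === "..") {
            if (segments.length === 0) {
                // Illustrative wording only.
                throw new Error(`Refusing to write outside ${root}: ${candidate}`);
            }
            segments.pop();
            continue;
        }
        segments.push(part);
    }
    return [root.replace(/\/+$/, ""), ...segments].join("/");
}

const recoveredRoot = "/tmp/3d-chess-v4/.lab/recover/recovered";
const safePath = resolveInsideRoot(recoveredRoot, "src/../src/app.ts");

let escapeRefused = false;
try {
    resolveInsideRoot(recoveredRoot, "../../etc/passwd");
} catch {
    escapeRefused = true;
}
```

Treating absolute candidates as relative to the root, as this sketch does, is one possible design choice; rejecting them outright would be equally defensible.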

Performance notes

File walking, source-map reads, inline inspection, and recovery writes use a default concurrency of 8.
Ledger appends use .tail-state on the steady-state path and only rescan .lab/ledger/capture/ when metadata is absent or stale.
Replay-surface corridor scans cap file reads at 256 KiB per candidate file.
Inline data-URL source-map discovery scans the last 8 KiB of container files first and only falls back to a full read when the tail scan finds no sourceMappingURL.
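
The tail-first scan in the last note can be sketched as below. This is an illustration under assumptions: `Container`, `findSourceMappingUrl`, and `containerFromText` are hypothetical, the demo works on in-memory strings (so "bytes" are really characters), and the real implementation reads from disk.

```typescript
const TAIL_BYTES = 8 * 1024;

// Hypothetical abstraction over a file: a cheap tail read and an expensive full read.
interface Container {
    readTail(bytes: number): string;
    readAll(): string;
}

// Returns the last sourceMappingURL value in the text, or null.
function lastSourceMappingUrl(text: string): string | null {
    const pattern = /[#@]\s*sourceMappingURL=(\S+)/g;
    let last: string | null = null;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) last = match[1];
    return last;
}

// Scan the tail first; fall back to a full read only if the tail has no marker.
function findSourceMappingUrl(container: Container): string | null {
    return lastSourceMappingUrl(container.readTail(TAIL_BYTES)) ?? lastSourceMappingUrl(container.readAll());
}

// In-memory container that counts full reads, for demonstration.
function containerFromText(text: string) {
    const stats = { fullReads: 0 };
    const container: Container = {
        readTail: (bytes) => text.slice(-bytes),
        readAll: () => {
            stats.fullReads += 1;
            return text;
        },
    };
    return { container, stats };
}

const trailing = containerFromText("x".repeat(10000) + "\n//# sourceMappingURL=data:application/json;base64,AAAA");
const trailingUrl = findSourceMappingUrl(trailing.container);

const leading = containerFromText("//# sourceMappingURL=app.js.map\n" + "x".repeat(10000));
const leadingUrl = findSourceMappingUrl(leading.container);
```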

Migrating from v3

The CLI no longer accepts audit, sources, or inline subcommands.
The supported entrypoint is ushman-recover <workspace-root>.
Recovery output is no longer written into workspace-root recovered-* directories.
Advisory files stay under .lab/recover/recovered/ and must be promoted manually.

Package inventory notes

Package inventory is intentionally offline-by-default. This package infers versions from vendored evidence and local package.json data when available, but it does not hit the network to classify registry status or publication visibility during the core flow. Those fields stay unknown unless upstream orchestration enriches them.

Extending version inference

Version rules are registered in src/version-inference-registry.ts, not in the main reducer loop. Each strategy declares:

packageScope
matcher
precedence
description

Keep precedence stable:

recovered package.json metadata first
vendor heuristics after that
manifest fallback versions last in inferPackageVersions()

Add new rules by inserting a strategy into the registry with explicit precedence and fixture-backed tests for ordering and deterministic matches.
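
A registry with explicit precedence might look roughly like the sketch below. Only the four field names come from this README; everything else is an assumption: the field types, the `"*"` wildcard scope, "lower precedence runs first", rejecting duplicate precedence values, and the `createRegistry` / `infer` names. It does not reflect the actual contents of src/version-inference-registry.ts.

```typescript
interface VersionStrategy {
    packageScope: string;                         // package name, or "*" for any (assumed semantics)
    matcher: (evidence: string) => string | null; // returns a version or null (assumed signature)
    precedence: number;                           // lower runs first (assumption)
    description: string;
}

function createRegistry() {
    const strategies: VersionStrategy[] = [];
    return {
        register(strategy: VersionStrategy): void {
            // Rejecting ties keeps evaluation order deterministic.
            if (strategies.some((s) => s.precedence === strategy.precedence)) {
                throw new Error(`Duplicate precedence ${strategy.precedence}`);
            }
            strategies.push(strategy);
            strategies.sort((a, b) => a.precedence - b.precedence);
        },
        infer(packageName: string, evidence: string): { version: string; via: string } | null {
            for (const s of strategies) {
                if (s.packageScope !== "*" && s.packageScope !== packageName) continue;
                const version = s.matcher(evidence);
                if (version !== null) return { version, via: s.description };
            }
            return null;
        },
    };
}

const registry = createRegistry();
registry.register({
    packageScope: "react",
    precedence: 20,
    description: "vendor banner heuristic",
    matcher: (e) => /react v(\d+\.\d+\.\d+)/.exec(e)?.[1] ?? null,
});
registry.register({
    packageScope: "*",
    precedence: 10,
    description: "recovered package.json metadata",
    matcher: (e) => /"version"\s*:\s*"([^"]+)"/.exec(e)?.[1] ?? null,
});

// package.json metadata wins even though it was registered second.
const inferred = registry.infer("react", '{"name":"react","version":"19.2.0"} /* react v18.3.1 */');
const fromBanner = registry.infer("react", "/* react v18.3.1 */");
```

A fixture-backed test for a new rule would assert both the matched version and which strategy produced it, as the `via` field allows here.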

Where this fits in the family

| Relationship | Packages and stages |
|---|---|
| Consumed by | ushman intake / seed / migration workflows |
| Adjacent to | ushman-verify (pre-flight), ushman-analyze (deterministic static checks), ushman-vendor-tools (byte mutation) |
| Does not replace | ushman parity, ushman sweep, or any authoritative stage schema in the parent orchestrator |
