hankweave-trace

v0.0.2

Published

19 days ago

Observability adapter for Hankweave — uploads execution traces to Braintrust and Langfuse

Upload Hankweave execution traces to Braintrust and Langfuse for visual inspection, cost tracking, and team-wide dashboards.

# Upload a completed run
npx hankweave-trace upload ./my-execution-dir

# Watch a live execution and stream spans in real-time
npx hankweave-trace watch ./my-execution-dir

# Generate provider JSON without uploading (no credentials needed)
npx hankweave-trace generate ./my-execution-dir --braintrust

What it does

Hankweave produces detailed execution logs (event journals, per-codon transcripts, structured state), but this data lives as flat files inside the execution directory.

Point hankweave-trace at a running or completed execution directory, and each hank run becomes one trace. Codons, LLM calls, tool calls, rig setups (with per-command stdout/stderr), sentinels, loops, and iterations each get their own span with correct parent-child relationships, tokens, costs, errors, and timing.

Setup

Option 1: Config file (recommended for teams)

Create .hankweave-trace.json in your project root (or ~/.config/hankweave-trace/config.json globally):

{
  "braintrust": {
    "apiKey": "$BRAINTRUST_API_KEY",
    "project": "My Hanks"
  },
  "langfuse": {
    "publicKey": "$LANGFUSE_PUBLIC_KEY",
    "secretKey": "$LANGFUSE_SECRET_KEY",
    "baseUrl": "http://your-langfuse:3000"
  },
  "tags": ["production"]
}

Values starting with $ are resolved from environment variables, so you can commit the file safely and keep the secrets in your shell profile.
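The `$` substitution described above can be pictured with a minimal sketch. The helper name `resolveEnvRefs` is hypothetical, and throwing on an unset variable is an assumption; the package's real behavior for missing variables is not documented here.

```typescript
// Hypothetical helper, not the package's actual implementation.
type ConfigValue = string | boolean | string[] | { [key: string]: ConfigValue };

function resolveEnvRefs(
  value: ConfigValue,
  env: Record<string, string | undefined>,
): ConfigValue {
  if (typeof value === "string") {
    // "$NAME" is replaced by env.NAME; any other string passes through unchanged.
    if (!value.startsWith("$")) return value;
    const name = value.slice(1);
    const resolved = env[name];
    if (resolved === undefined) {
      // Assumption: fail loudly rather than upload with an empty credential.
      throw new Error(`Environment variable ${name} is not set`);
    }
    return resolved;
  }
  if (typeof value === "boolean") return value;
  if (Array.isArray(value)) {
    return value.map((item) => resolveEnvRefs(item, env) as string);
  }
  // Plain object: resolve each property recursively.
  const out: { [key: string]: ConfigValue } = {};
  for (const [key, child] of Object.entries(value)) {
    out[key] = resolveEnvRefs(child, env);
  }
  return out;
}
```

Calling `resolveEnvRefs(config, process.env)` on the parsed JSON would then yield a config object with real credentials in place of the `$...` placeholders.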

Option 2: Environment variables

export HANKWEAVE_TRACE_BRAINTRUST_API_KEY=sk-...
export HANKWEAVE_TRACE_LANGFUSE_PUBLIC_KEY=pk-...
export HANKWEAVE_TRACE_LANGFUSE_SECRET_KEY=sk-...
export HANKWEAVE_TRACE_LANGFUSE_BASE_URL=http://your-langfuse:3000

The two methods can be combined. Precedence, highest first: CLI flags > env vars > config file > defaults. If both platforms are configured, both receive the trace.
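One way to picture the precedence order is a layered object merge, sketched below. The `TraceSettings` shape and the function names are hypothetical; only the ordering and the two default values (`"Hankweave"`, `https://cloud.langfuse.com`) come from this README.

```typescript
// Hypothetical settings shape used only for this illustration.
interface TraceSettings {
  project: string;
  baseUrl: string;
  redact: boolean;
}

const DEFAULTS: TraceSettings = {
  project: "Hankweave",
  baseUrl: "https://cloud.langfuse.com",
  redact: false,
};

// Remove keys whose value is undefined so they cannot mask a lower layer.
function dropUndefined(layer: Partial<TraceSettings>): Partial<TraceSettings> {
  return Object.fromEntries(
    Object.entries(layer).filter(([, value]) => value !== undefined),
  ) as Partial<TraceSettings>;
}

function mergeSettings(
  configFile: Partial<TraceSettings>,
  envVars: Partial<TraceSettings>,
  cliFlags: Partial<TraceSettings>,
): TraceSettings {
  // Later spreads win, so layers are listed from lowest to highest precedence.
  return {
    ...DEFAULTS,
    ...dropUndefined(configFile),
    ...dropUndefined(envVars),
    ...dropUndefined(cliFlags),
  };
}
```

A value set in the config file is therefore used only when neither an environment variable nor a CLI flag supplies the same setting.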

Usage

`upload` — post-hoc

hankweave-trace upload <execution-dir> [flags]

| Flag | Description |
| ------------------ | ------------------------------------------------ |
| `--braintrust` | Upload to Braintrust only |
| `--langfuse` | Upload to Langfuse only |
| `--project <name>` | Braintrust project name (default: "Hankweave") |
| `--dry-run` | Output spans as JSON without uploading |
| `--latest-only` | Upload only the latest run (default: all runs) |
| `--redact` | Strip content, keep structure and metrics |
| `--force` | Re-upload even if dedup marker exists |
| `--tags <a,b,c>` | Extra tags on all traces |

`generate` — inspect payloads

Generate the provider-specific JSON payload without uploading. No API credentials required.

hankweave-trace generate <execution-dir> --braintrust [flags]
hankweave-trace generate <execution-dir> --langfuse   [flags]

Stdout is pure JSON (pipe to jq, save to file). Progress goes to stderr.

hankweave-trace generate ./exec-dir --braintrust | jq length
hankweave-trace generate ./exec-dir --langfuse | jq '.[0].type'
hankweave-trace generate ./exec-dir --braintrust > payload.json

`watch` — real-time

hankweave-trace watch <execution-dir> [flags]

- Waits for the execution directory to appear
- Streams spans in real-time as codons execute
- Prints live trace URLs for your browser
- Does a final complete upload when the run finishes
- Detects crashed processes (via PID check + lock file)
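The PID check mentioned in the last item can be done in Node-compatible runtimes with signal 0. This is a sketch of the general technique, not the package's code; `isProcessAlive` is a hypothetical name, and where the PID comes from (the lock file's location and format) is not documented here.

```typescript
// Returns true if a process with this PID currently exists.
function isProcessAlive(pid: number): boolean {
  try {
    // Signal 0 performs the existence and permission checks without sending a signal.
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user.
    // ESRCH (or anything else): treat the process as gone.
    return (err as { code?: string }).code === "EPERM";
  }
}
```

A watcher could treat "lock file present but `isProcessAlive(pid)` is false" as a crashed run and proceed to the final upload.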

How it works

hankweave-trace reads three data sources from a Hankweave execution directory:

| Source | What it provides |
| -------------------------------- | -------------------------------------------------------------------- |
| `.hankweave/state.json` | Run metadata, codon states, costs, tokens, checkpoints |
| `.hankweave/events/events.jsonl` | Orchestration events: rigs, sentinels, loops, errors |
| `.hankweave/runs/{runId}/*.log` | Per-codon agent transcripts: LLM calls, tool calls, thinking blocks |
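For readers who want to inspect the event journal themselves, a JSON Lines file can be read with a few lines of TypeScript. This is a minimal sketch: the function names are hypothetical, and the events are typed as `unknown` because the fields of each event are not described in this README.

```typescript
import { readFileSync } from "node:fs";
import { join } from "node:path";

// Parse JSON Lines text: one JSON document per non-empty line.
function parseJsonl(text: string): unknown[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as unknown);
}

// The relative path is the one listed for orchestration events.
function readEvents(executionDir: string): unknown[] {
  const file = join(executionDir, ".hankweave", "events", "events.jsonl");
  return parseJsonl(readFileSync(file, "utf8"));
}
```

Note that while a run is still in progress the last line may be partially written, so a live reader would need to tolerate a trailing line that fails to parse.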

Spans are converted to each platform's native format:

- Braintrust: flat spans with span_attributes.type, metrics for tokens, error for failures
- Langfuse: typed events (trace-create, generation-create, span-create) with totalCost overrides and level: "ERROR" for failures

Token accuracy

Claude's prompt caching means raw token counts produce wrong costs. hankweave-trace handles this:

- Braintrust: prompt_tokens is the total input (non-cached + cache creation + cache read), not just the non-cached tokens, which would cause negative costs. The authoritative cost is stored in metadata.hankweaveCost.
- Langfuse: a totalCost override is set on every generation, preventing the ~5x overestimate from Langfuse's own pricing model.
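The Braintrust token arithmetic amounts to a single sum, sketched below. The `ClaudeUsage` field names and the function name are hypothetical; `prompt_tokens` is the metric named above, and `completion_tokens` is assumed here to be the matching output metric.

```typescript
// Hypothetical shape for one LLM call's usage numbers.
interface ClaudeUsage {
  inputTokens: number;         // non-cached input tokens
  cacheCreationTokens: number; // tokens written to the prompt cache
  cacheReadTokens: number;     // tokens served from the prompt cache
  outputTokens: number;
}

function braintrustTokenMetrics(usage: ClaudeUsage) {
  // Total input = non-cached + cache creation + cache read.
  const promptTokens =
    usage.inputTokens + usage.cacheCreationTokens + usage.cacheReadTokens;
  return {
    prompt_tokens: promptTokens,
    completion_tokens: usage.outputTokens, // assumed metric name
  };
}
```

For a call with 100 non-cached, 2,000 cache-creation, and 8,000 cache-read input tokens, `prompt_tokens` is 10,100 rather than 100.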

Shim harnesses

For non-Claude agents (Codex, Gemini, Pi, OpenCode), per-message token breakdown isn't available. Tokens go on the codon span; child LLM/tool spans show the conversation structure but carry zero tokens.

Idempotent uploads

All span IDs are deterministic (SHA-256 of run data). Re-uploading produces the same IDs — both platforms upsert. A marker file .hankweave/tracing-marker.json prevents accidental re-uploads (override with --force).
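Deterministic IDs of this kind can be derived by hashing stable run data, as in the sketch below. Which fields the package actually hashes, and whether it truncates the digest, are not stated here; the function name and the choice of inputs are assumptions for illustration.

```typescript
import { createHash } from "node:crypto";

// Derive a stable ID from the run ID plus any further identifying parts.
function deterministicSpanId(runId: string, ...parts: string[]): string {
  // A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
  const input = [runId, ...parts].join("\u0000");
  return createHash("sha256").update(input).digest("hex");
}
```

Because the same inputs always produce the same ID, uploading a run twice addresses the same spans on the platform instead of creating duplicates.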

Configuration reference

Config file

Searched in order: ./.hankweave-trace.json → ~/.config/hankweave-trace/config.json.

{
  "braintrust": { "apiKey": "$BRAINTRUST_API_KEY", "project": "Hankweave" },
  "langfuse": {
    "publicKey": "$LANGFUSE_PUBLIC_KEY",
    "secretKey": "$LANGFUSE_SECRET_KEY",
    "baseUrl": "https://cloud.langfuse.com"
  },
  "tags": ["my-team"],
  "redact": false
}

Environment variables

| Variable | Purpose |
| ------------------------------------- | ----------------------------------------------------------- |
| `HANKWEAVE_TRACE_BRAINTRUST_API_KEY` | Braintrust API key |
| `HANKWEAVE_TRACE_BRAINTRUST_PROJECT` | Braintrust project name (default: "Hankweave") |
| `HANKWEAVE_TRACE_LANGFUSE_PUBLIC_KEY` | Langfuse public key |
| `HANKWEAVE_TRACE_LANGFUSE_SECRET_KEY` | Langfuse secret key |
| `HANKWEAVE_TRACE_LANGFUSE_BASE_URL` | Langfuse server URL (default: https://cloud.langfuse.com) |
| `HANKWEAVE_TRACE_TAGS` | Comma-separated default tags |
| `HANKWEAVE_TRACE_REDACT` | `1` or `true` to strip content by default |
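The two non-string variables, `HANKWEAVE_TRACE_TAGS` and `HANKWEAVE_TRACE_REDACT`, imply a small amount of parsing. The sketch below shows one plausible reading; the function names are hypothetical, and trimming whitespace and matching `true` case-sensitively are assumptions.

```typescript
// Split a comma-separated tag list, dropping empty entries.
function parseTags(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(",")
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0);
}

// Only "1" or "true" enable redaction; anything else leaves it off.
function parseRedact(raw: string | undefined): boolean {
  return raw === "1" || raw === "true";
}
```

For example, `parseTags(process.env.HANKWEAVE_TRACE_TAGS)` with the variable set to `production, my-team` would give `["production", "my-team"]`.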

Development

git clone ... hankweave-tracing && cd hankweave-tracing

# Link hankweave for type reference (optional — types are inlined)
cd ../hankweave-3 && bun link
cd ../hankweave-tracing && bun link hankweave && bun install

bun run tc            # Type-check
bun run lint          # Lint
bun run build         # Build to dist/

bun run src/index.ts upload <dir> --dry-run
bun run src/index.ts watch <dir>
bun run src/index.ts generate <dir> --braintrust
