@shpitdev/codexharness

v0.0.5

Codex conductor, nanny, and TUI runtime harness.

Downloads

347

Readme

codex-orchestration

Stack: Bun, TypeScript, Biome.

Browser-observable multi-agent orchestration on Codex app-server.

What This Repo Does

  • codex-conductor: runs multi-agent implementation loops (solutionlead -> engineer -> tester) and captures full run telemetry.
  • codex-nanny: separate thread watcher for human Codex sessions; sends idle nudges and follow-up prompts.
  • Browser monitor: live + rewind visualization of run events, handoffs, messages, and checkpoints.

The runtime is app-server only (no SDK backend path).

How Conductor Works

  1. Start a run from a spec file or --prompt.
  2. Runner executes role turns through Codex app-server threads.
  3. Every event is persisted to run artifacts (events.jsonl, per-role logs, turn snapshots, reports).
  4. A local monitor service serves the web viewer from apps/web build artifacts for live state and rewind.
  5. Monitor stays alive after completion so runs can be reviewed later.
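The persisted timeline from step 3 lends itself to simple offline inspection. The following is a minimal sketch, assuming each line of events.jsonl holds one JSON object. The `role` and `type` fields and both function names are hypothetical; the real event schema is not documented in this README.

```typescript
// Hypothetical event shape: these field names are assumptions, not the real schema.
interface RunEvent {
  role?: string;
  type?: string;
  [key: string]: unknown;
}

// Parse JSONL text: one JSON object per non-empty line.
// Malformed lines and non-object values are skipped instead of aborting the read.
function parseEventsJsonl(text: string): RunEvent[] {
  const events: RunEvent[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
        events.push(parsed as RunEvent);
      }
    } catch {
      // A partially written last line is plausible while a run is live; ignore it.
    }
  }
  return events;
}

// Count events per role; events without a string role are grouped as "unknown".
function countByRole(events: RunEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const event of events) {
    const role = typeof event.role === "string" ? event.role : "unknown";
    counts.set(role, (counts.get(role) ?? 0) + 1);
  }
  return counts;
}
```

Skipping bad lines rather than throwing is a design choice for this sketch: a file that is still being appended to may end with an incomplete line.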

Getting Started

bun install
bun link

After bun link, commands are available globally:

  • codex-conductor
  • codex-nanny
  • codex-tui

Build host-native CLI binaries:

bun run build:cli

Outputs:

  • dist/codex-conductor
  • dist/codex-nanny
  • dist/codex-tui
  • dist/monitor-web/ (staged copy of apps/web/dist used by monitor serving)

Compiled binaries include runner/nanny/TUI internals (no runtime dependency on src/*.ts paths).

Fresh-machine binary flow check (isolated bundle, no source-path dependency):

bun run test:e2e:fresh-binary -- --dist-dir ./dist

npm packaging plan and publish gates:

docs/npm-packaging-plan.md
RELEASE.md

Target public package: @shpitdev/codexharness.

Monorepo Scaffold

The repo now includes first-pass app/package boundaries for the long-term split:

  • apps/web — Solid web monitor viewer (bun run dev:web)
  • apps/tui — OpenTUI run list + status badges + live events tail + turn detail inspector + final gate panel (bun run dev:tui)
  • packages/core — core runner/state/policy/audit runtime
  • packages/cli — conductor/nanny/monitor command and daemon surfaces
  • packages/monitor-api — monitor run/event discovery + API response shaping

Runtime now lives in packages/core + packages/cli.

Current extracted core modules:

  • packages/core/src/state.ts
  • packages/core/src/threadTypes.ts
  • packages/core/src/audit.ts
  • packages/core/src/policy.ts
  • packages/core/src/runDirs.ts
  • packages/core/src/threadBackend.ts
  • packages/core/src/appServerClient.ts
  • packages/core/src/threadEvents.ts
  • packages/core/src/threadBackendAppServer.ts
  • packages/core/src/evidence.ts
  • packages/core/src/artifacts.ts
  • packages/core/src/io.ts
  • packages/core/src/report.ts
  • packages/core/src/runner.ts
  • packages/core/src/agentDocs.ts
  • packages/core/src/chime.ts
  • packages/core/src/cliArgs.ts
  • packages/core/src/env.ts
  • packages/core/src/gitignore.ts
  • packages/core/src/schema.ts
  • packages/core/src/threadState.ts
  • packages/core/src/todos.ts

Current extracted CLI modules:

  • packages/cli/src/codexConductor.ts
  • packages/cli/src/codexNanny.ts
  • packages/cli/src/conductorMonitor.ts
  • packages/cli/src/nanny.ts
  • packages/cli/src/nannyPolicy.ts
  • packages/cli/src/nannyState.ts
  • packages/cli/src/thread-cli.ts
  • packages/cli/src/report-cli.ts

Current extracted monitor API modules:

  • packages/monitor-api/src/index.ts

Source-root compatibility shims have been removed; scripts/tests now import package modules directly.

Quickstart

Run conductor in current folder:

codex-conductor -p "implement X in this repo"

Run from spec:

codex-conductor specs/v1/example.md --fresh

Defaults:

  • workdir: current directory
  • monitor: auto-start detached local daemon process
  • model: gpt-5.3-codex-spark
  • reasoning effort: xhigh
  • verification gate: tester must provide verification evidence when should_test=true
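As an illustration of how these defaults and the verification gate could be expressed, here is a minimal sketch. The default values come from the list above; the option names, the `TesterResult` shape, and the `evidence` field are assumptions made for the example and are not taken from the package source.

```typescript
// Option names are hypothetical; only the default values come from the README.
interface RunOptions {
  workdir: string;
  model: string;
  reasoningEffort: string;
}

// Apply the documented defaults, letting explicit overrides win.
function resolveOptions(cwd: string, overrides: Partial<RunOptions> = {}): RunOptions {
  return {
    workdir: overrides.workdir ?? cwd,
    model: overrides.model ?? "gpt-5.3-codex-spark",
    reasoningEffort: overrides.reasoningEffort ?? "xhigh",
  };
}

// Hypothetical tester output: only should_test is named in the README.
interface TesterResult {
  should_test: boolean;
  evidence: string[];
}

// The gate passes when testing was not requested, or when at least one
// non-blank piece of verification evidence is present.
function passesVerificationGate(result: TesterResult): boolean {
  if (!result.should_test) return true;
  return result.evidence.some((item) => item.trim() !== "");
}
```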

At run end, conductor prints the monitor URL for that run.

Usage:

codex-conductor <spec-file|-p|--prompt ...> [runner flags]
codex-conductor monitor <start|status|stop|open> [--workdir <path>] [--port <n>]
codex-conductor tui [--workdir <path>] [--monitor-port <n>] [--monitor-base-url <url>]
codex-tui [--workdir <path>] [--monitor-port <n>] [--monitor-base-url <url>]

Model override example:

codex-conductor --prompt "implement X" --model gpt-5.3-codex-spark --model-reasoning-effort xhigh

Reviewing A Run

  1. Start a run with codex-conductor.
  2. Open the printed monitor URL (or run codex-conductor monitor open --workdir .).
  3. Use the rewind slider for event-by-event replay.
  4. Select stage nodes or trace cards to inspect parsed turn detail (trigger/actions/output/next).
  5. Switch Trace/Raw to compare consolidated turn cards vs raw event rows.
  6. Expand Todo Overview to inspect latest per-role todos and todo update history.
  7. Review final artifacts under .runner-state/runs/<runId>/.
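Event-by-event replay, as offered by the rewind slider, can be thought of as folding the timeline up to a chosen position. The sketch below illustrates that idea only. The `TimelineEvent` and `ReplayState` shapes and the handoff counting rule are assumptions, not the viewer's actual implementation.

```typescript
// Hypothetical minimal event: the real events carry more fields.
interface TimelineEvent {
  role: string;
  type: string;
}

// Hypothetical derived state shown at a given slider position.
interface ReplayState {
  position: number;
  activeRole: string | null;
  handoffs: number;
}

// Rebuild state from the first `position` events.
// The position is clamped to [0, events.length]; non-finite input maps to 0.
function replayTo(events: TimelineEvent[], position: number): ReplayState {
  const requested = Number.isFinite(position) ? Math.floor(position) : 0;
  const end = Math.max(0, Math.min(requested, events.length));
  let activeRole: string | null = null;
  let handoffs = 0;
  for (const event of events.slice(0, end)) {
    // Assumption: a change of role between consecutive events counts as a handoff.
    if (activeRole !== null && event.role !== activeRole) handoffs += 1;
    activeRole = event.role;
  }
  return { position: end, activeRole, handoffs };
}
```

Recomputing from the start at every position keeps the sketch simple; a real viewer would more likely cache intermediate states or use checkpoints.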

Monitor Commands

codex-conductor monitor status
codex-conductor monitor start --workdir . --port 42427
codex-conductor monitor open --workdir .
codex-conductor monitor stop --workdir .

Notes:

  • Monitor home lists recent run metadata with stage/status and prompt previews.

TUI Command

codex-conductor tui --workdir .
codex-tui --workdir .
codex-tui --monitor-base-url http://127.0.0.1:42427

Notes:

  • codex-conductor tui and codex-tui auto-start monitor for --workdir unless --monitor-base-url is provided.
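The rule in the note above can be sketched as a small resolver. This is illustrative only: the flag field names mirror the CLI flags, but the `resolveMonitorTarget` function and the use of 42427 as a fallback port are assumptions (that port appears in the monitor start example, and the README does not state the actual default).

```typescript
// Mirrors --monitor-base-url and --monitor-port; the object shape is hypothetical.
interface MonitorFlags {
  monitorBaseUrl?: string;
  monitorPort?: number;
}

interface MonitorTarget {
  baseUrl: string;
  autoStart: boolean;
}

// Assumption: 42427 is taken from the README example, not a confirmed default.
const ASSUMED_FALLBACK_PORT = 42427;

// An explicit base URL disables auto-start; otherwise target a local monitor.
function resolveMonitorTarget(flags: MonitorFlags): MonitorTarget {
  if (flags.monitorBaseUrl !== undefined && flags.monitorBaseUrl !== "") {
    // Drop trailing slashes so later path joins do not produce "//".
    return { baseUrl: flags.monitorBaseUrl.replace(/\/+$/, ""), autoStart: false };
  }
  const port = flags.monitorPort ?? ASSUMED_FALLBACK_PORT;
  return { baseUrl: `http://127.0.0.1:${port}`, autoStart: true };
}
```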

Nanny (Separate Interaction)

Thread watcher examples:

bun run nanny -- --dry-run --once
bun run nanny -- --idle-seconds 240 --cooldown-seconds 900
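The two timing flags suggest a simple decision rule: nudge only after the thread has been idle long enough, and not again until the cooldown has elapsed. The sketch below shows one plausible reading of that rule. The `NudgeInput` shape and `shouldNudge` are hypothetical, and the actual nanny policy may consider more signals.

```typescript
// Hypothetical inputs; timestamps are in milliseconds since the epoch.
interface NudgeInput {
  nowMs: number;
  lastActivityMs: number;
  lastNudgeMs: number | null; // null when no nudge has been sent yet
  idleSeconds: number;        // corresponds to --idle-seconds
  cooldownSeconds: number;    // corresponds to --cooldown-seconds
}

// Returns true when the thread is idle and the cooldown since the last nudge has passed.
function shouldNudge(input: NudgeInput): boolean {
  const idleForMs = input.nowMs - input.lastActivityMs;
  if (idleForMs < input.idleSeconds * 1000) return false;
  if (input.lastNudgeMs === null) return true;
  return input.nowMs - input.lastNudgeMs >= input.cooldownSeconds * 1000;
}
```

With the values from the second example (240 and 900), a thread idle for four minutes would get a first nudge, and no further nudge for at least fifteen minutes under this reading.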

Tmux launcher examples:

codex-nanny .
codex-nanny --workdir . -- --model gpt-5.3-codex

codex-nanny starts a tmux session with two panes:

  • left pane: nanny monitor process
  • right pane: Codex interactive session

Local State And Artifacts

Per target repo:

  • <workdir>/.runner-state/
  • <workdir>/.runner-state/runs/<runId>/

Per-run artifacts:

  • manifest.json
  • state.json
  • events.jsonl (canonical timeline)
  • events-by-role/*.jsonl
  • turn-*.json
  • report.md / report.json / mermaid outputs
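For scripts that post-process a run, the layout above can be captured in a small path helper. This is a sketch under stated assumptions: `runArtifactPaths` is a hypothetical name, POSIX-style separators are assumed, and only the file names listed above are used.

```typescript
interface RunArtifactPaths {
  root: string;
  manifest: string;
  state: string;
  events: string;
}

// Build paths for <workdir>/.runner-state/runs/<runId>/ and its fixed-name files.
function runArtifactPaths(workdir: string, runId: string): RunArtifactPaths {
  // Reject run ids that could escape the runs directory.
  if (runId === "" || runId === "." || runId === ".." || /[\\/]/.test(runId)) {
    throw new Error(`invalid runId: ${runId}`);
  }
  // Treat an empty workdir as the current directory and trim trailing slashes.
  const base = workdir === "" ? "." : workdir.replace(/\/+$/, "");
  const root = `${base}/.runner-state/runs/${runId}`;
  return {
    root,
    manifest: `${root}/manifest.json`,
    state: `${root}/state.json`,
    events: `${root}/events.jsonl`,
  };
}
```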

Web monitor assets are built once under apps/web/dist (and staged to dist/monitor-web by bun run build:cli).

Regenerate report:

bun run report -- --artifacts .runner-state --svg

Validation

Typecheck:

bunx tsc -p tsconfig.json --noEmit

Tests:

bun run test:unit

Unit tests intentionally cover unit and component scope only; they do not verify a full real run end to end.

Unit test files follow *.unit.test.ts under tests/.
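The naming convention can be checked mechanically, for example in a lint step. A minimal sketch follows; `isUnitTestFile` is a hypothetical helper and is not part of the repo.

```typescript
// True for paths under tests/ whose file name ends in .unit.test.ts.
// Backslashes are normalized and a leading "./" is ignored.
function isUnitTestFile(path: string): boolean {
  const normalized = path.replace(/\\/g, "/").replace(/^\.\//, "");
  return normalized.startsWith("tests/") && normalized.endsWith(".unit.test.ts");
}
```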

Manual end-to-end scenario stub:

docs/scenarios/real-run-e2e.scenario.stub.md

Real-run e2e harness (executes a real prompt, then asserts high-level scenario checks):

bun run test:e2e:real -- --workdir . --prompt "implement X and verify"

Run the real e2e harness against a compiled conductor binary artifact:

bun run test:e2e:real -- --workdir . --prompt "implement X and verify" --conductor-bin ./dist/codex-conductor

Optional expected output artifact check:

bun run test:e2e:real -- --workdir . --run-id <runId> --expected-output-path output/result.json

PTY-driven TUI e2e harness (real monitor API + real OpenTUI process in a pseudo-terminal):

bun run test:e2e:tui

Keep the seeded workdir for debugging:

bun run test:e2e:tui -- --keep-workdir

Hard real-generation validation:

scripts/validate-real-generations.sh /path/to/target/repo

CI notes:

  • CI / build-cli builds dist/codex-conductor + dist/codex-nanny + dist/codex-tui and uploads them as workflow artifacts.
  • CI / build-cli also runs bun run test:e2e:fresh-binary -- --dist-dir ./dist before artifact upload.
  • CI / e2e-tui-pty runs PTY-driven TUI e2e (bun run test:e2e:tui) and uploads .memory/tui-pty-e2e logs/artifacts.
  • CI / e2e-real-binary runs on every PR using --conductor-bin ./dist/codex-conductor, cost-tuned model candidates (codex-mini-latest first), low reasoning effort, bounded retries, and publishes full command output + run artifacts.
  • CI / e2e-real-binary requires repository/org secret OPENAI_API_KEY so codex app-server can run in CI.
  • CodeQL must stay green (no new code scanning alerts on changed code).
  • Required check list for branch protection: docs/required-checks.md
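The "cost-tuned model candidates" with "bounded retries" described for CI / e2e-real-binary can be pictured as an ordered fallback loop. The sketch below is an assumption about the general shape, not the CI script itself: `runWithCandidates` and the `Attempt` callback are hypothetical, and the callback is synchronous only to keep the example short.

```typescript
// Hypothetical callback: returns true when a run with the given model succeeded.
type Attempt = (model: string) => boolean;

// Try each candidate in order, allowing up to maxRetriesPerModel extra attempts
// per model. Returns the first model that succeeds, or null if none do.
function runWithCandidates(
  candidates: string[],
  maxRetriesPerModel: number,
  attempt: Attempt,
): string | null {
  for (const model of candidates) {
    for (let tryIndex = 0; tryIndex <= maxRetriesPerModel; tryIndex += 1) {
      try {
        if (attempt(model)) return model;
      } catch {
        // A thrown error counts as a failed attempt and uses up one try.
      }
    }
  }
  return null;
}
```

Placing the cheapest candidate first (codex-mini-latest in the CI notes above) means more expensive models are used only after the bounded attempts on cheaper ones are exhausted.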

CLI productization roadmap:

docs/cli-roadmap.md

Advanced Direct Runner

bun run runner -- specs/v1/example.md --workdir .

Screenshots

Conductor monitor: [screenshot: Codex Conductor monitor]

Nanny interaction: [screenshot: Codex Nanny interaction]

Thread Lifecycle CLI

bun run threads -- list --limit 20
bun run threads -- read --thread-id <threadId> --include-turns
bun run threads -- archive --thread-id <threadId>
bun run threads -- unarchive --thread-id <threadId>
bun run threads -- compact --thread-id <threadId>