agentflight
v0.5.1
Local-first flight recorder for AI coding agents.
AgentFlight
See what your coding agent did. Prove it works. Know what to do next.
AgentFlight is a local-first flight recorder for AI coding agents from Baseframe Labs. It sits around Codex, Claude Code, Cursor, Windsurf, Gemini CLI, Aider, OpenCode, and similar tools so you can review the session instead of guessing what happened.
Website: baseframelabs.com/apps/agentflight
AgentFlight helps you:
- start an AI coding session
- capture verification evidence
- see changed files and risk
- create snapshots during the session
- generate a proof report
- generate a local replay timeline
- create a resume prompt for the next agent or reviewer

60-Second Workflow
npx agentflight@latest init
npx agentflight@latest start --task "Add password reset flow"
# Run Codex, Claude Code, Cursor, or your coding agent normally
npx agentflight@latest verify -- npm test
npx agentflight@latest snapshot --note "Initial implementation verified"
npx agentflight@latest status
npx agentflight@latest report
npx agentflight@latest replay
npx agentflight@latest resume
What you get:
- init: creates local .agentflight/ project files.
- start: records the task, git branch, commit, dirty state, package manager, and tool availability.
- verify -- npm test: runs the command and stores stdout, stderr, exit code, timing, and pass/fail status.
- snapshot --note "...": records the current git, risk, and proof state as a timeline event.
- status: answers what changed, how risky it is, what proof exists, what proof is missing, and what to do next.
- report: writes a Markdown proof report for review.
- replay: writes a local HTML timeline you can open in a browser.
- resume: writes a Codex/Claude-ready prompt for the next safe step.
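To make the verify step concrete, here is a minimal TypeScript sketch of what capturing that evidence could look like. It is an illustration, not AgentFlight's implementation: the VerificationEvidence shape and the captureEvidence helper are hypothetical names, and only the recorded fields (stdout, stderr, exit code, timing, pass/fail) come from the description above.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical record shape; AgentFlight's real schema may differ.
interface VerificationEvidence {
  command: string;
  stdout: string;
  stderr: string;
  exitCode: number | null; // null when the process was killed or never started
  durationMs: number;
  status: "passed" | "failed";
}

// Run a command synchronously and keep what a reviewer would need to see.
function captureEvidence(command: string, args: string[]): VerificationEvidence {
  const startedAt = Date.now();
  const result = spawnSync(command, args, { encoding: "utf8" });
  const durationMs = Date.now() - startedAt;
  const exitCode = result.status;
  return {
    command: [command, ...args].join(" "),
    stdout: result.stdout ?? "",
    stderr: result.stderr ?? "",
    exitCode,
    durationMs,
    // Assumption: only a zero exit code counts as a pass.
    status: exitCode === 0 ? "passed" : "failed",
  };
}

// Usage: run a trivial Node command in place of "npm test".
const evidence = captureEvidence(process.execPath, ["-e", "console.log('ok')"]);
console.log(evidence.status, evidence.exitCode, `${evidence.durationMs}ms`);
```

Treating any non-zero or missing exit code as a failure keeps the record conservative: a command that was killed or could not be started never counts as proof.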
Watch The Flow
AgentFlight turns a loose AI-agent session into a local proof trail:
- Start a session before you ask the coding agent to work.
- Capture real verification output with agentflight verify.
- Snapshot meaningful checkpoints.
- Read status to see changed files, risk, proof, gaps, and next action.
- Generate report, replay, and resume when the work is ready to review or hand off.
The replay artifact is a self-contained local HTML file. It leads with the review verdict, then lays out risk, review focus, proof gaps, the session timeline, and verification evidence (with inline failure excerpts, so you can see what broke without opening a log file) as a readable flight record:

A high-resolution still is also available at docs/assets/agentflight-replay-timeline.png.
Why This Exists
AI coding agents move fast. After a few prompts, you can lose track of:
- what changed
- whether the agent drifted from the task
- what was verified
- what failed
- what is safe to review
- how to resume the work later
AgentFlight gives you a local control room for that work. It records the session, captures proof, shows risk, and creates handoff artifacts without uploading source code.
Sample Outputs
agentflight status:
AgentFlight status
Task:
Add password reset flow
Changed files:
3
Risk: medium
- Dependency, backend, or unknown files changed.
Verification Evidence:
1 passed, 0 failed
Review first:
1. src/auth/reset.ts
Why: identity/session path; no passing test evidence
Focus: Check session, permission, and identity boundaries first.
Suggested proof: npm test
Proof gaps:
- blocking: Sensitive auth, payment, or security files changed without passing test evidence.
Latest snapshot:
- Note: Initial implementation verified
- Risk: medium
- Changed files: 3
Readiness: Needs verification
Reason: Sensitive auth, payment, or security files changed without passing test evidence.
Next action:
Run agentflight verify -- npm test
agentflight report:
# AgentFlight Proof Report
## Review First
1. src/auth/reset.ts
- Why: identity/session path; no passing test evidence
## Verification Evidence
- passed: npm test
- stdout: .agentflight/evidence/.../verification-1.stdout.txt
- stderr: .agentflight/evidence/.../verification-1.stderr.txt
## Review Readiness
Needs verification
agentflight replay:
Replay saved:
.agentflight/reports/af-...-replay.html
Timeline:
session_started -> verification_passed -> snapshot_created -> replay_generated
agentflight resume:
Continue the AgentFlight session for: Add password reset flow
Latest snapshot:
Initial implementation verified
Verification state:
1 passed, 0 failed
Review focus:
src/auth/reset.ts - identity/session path
Guardrails:
- Stay scoped to the current task.
- Do not claim completion without proof.
- Run relevant verification before declaring success.
Current Capabilities
The current AgentFlight release supports:
- local session setup
- active session tracking
- git branch, commit, dirty state, and changed file detection
- changed file risk categorisation
- review focus ranking for changed files
- proof gap detection and review readiness recommendations
- configurable generated/internal changed-file filters
- verification evidence capture with agentflight verify
- inline failure excerpts in the replay and report, so failures are visible without opening evidence files
- session events
- snapshots with agentflight snapshot --note "..."
- Markdown proof reports
- self-contained HTML replay timelines
- resume prompts for Codex, Claude Code, or a human reviewer
- doctor checks for local setup
- defensive ProjScan and AgentLoopKit adapters
- no telemetry, cloud sync, or source upload
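The proof gap and readiness items in this list can be pictured with a small TypeScript sketch. Everything here is an assumption made for illustration: the SENSITIVE pattern, the assessReadiness helper, and the "Ready for review" label are hypothetical, and the real rules may weigh far more signals. Only the "Needs verification" label and the blocking gap message are taken from the sample status output.

```typescript
// Hypothetical types and rule; not AgentFlight's actual logic.
type Readiness = "Ready for review" | "Needs verification";

interface VerificationRun {
  command: string;
  passed: boolean;
}

interface ReadinessResult {
  readiness: Readiness;
  blockingGaps: string[];
}

// Assumed heuristic: a path segment that hints at sensitive code.
const SENSITIVE = /(^|\/)(auth|payment|payments|security|session)(\/|\.|$)/i;

function assessReadiness(changedFiles: string[], runs: VerificationRun[]): ReadinessResult {
  const sensitiveFiles = changedFiles.filter((file) => SENSITIVE.test(file));
  const hasPassingRun = runs.some((run) => run.passed);
  const blockingGaps: string[] = [];
  // A sensitive change with no passing verification run is a blocking gap.
  if (sensitiveFiles.length > 0 && !hasPassingRun) {
    blockingGaps.push(
      "Sensitive auth, payment, or security files changed without passing test evidence."
    );
  }
  return {
    readiness: blockingGaps.length > 0 ? "Needs verification" : "Ready for review",
    blockingGaps,
  };
}

// Usage: a changed auth file with no verification runs yet.
console.log(assessReadiness(["src/auth/reset.ts"], []).readiness);
```

The point of the sketch is the ordering of concerns: readiness is derived from recorded evidence, never from the agent's own claim that the work is done.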
What AgentFlight Is Not
AgentFlight is:
- not a coding agent
- not a cloud service
- not a replacement for tests
- not a security scanner
- not a CI platform
- not a code review replacement
Use your coding agent to make changes. Use AgentFlight to understand, verify, replay, and hand off the work.
How It Works Locally
AgentFlight creates a local .agentflight/ directory in your repo:
- config.json: stores local-first project settings.
- sessions/: stores session metadata.
- current/: stores the active session, handoff, and resume prompt.
- reports/: stores Markdown proof reports and HTML replays.
- evidence/: stores stdout and stderr from captured verification runs.
Sessions store an events timeline with meaningful moments such as session start, verification attempts, snapshots, and generated artifacts. Reports include filenames and summaries by default, not full source diffs.
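As a rough picture of such a timeline, the TypeScript sketch below models events and renders them in the arrow form shown in the sample replay output. The SessionEvent shape and the renderTimeline helper are hypothetical; only the four event names come from that sample.

```typescript
// Hypothetical event shape; the stored format may differ.
interface SessionEvent {
  type: string; // e.g. "session_started", "verification_passed"
  at: string; // ISO 8601 UTC timestamp
  note?: string;
}

// Order events by time and join their types into a one-line timeline.
function renderTimeline(events: SessionEvent[]): string {
  return [...events]
    // ISO 8601 UTC strings in the same format sort correctly as plain text.
    .sort((a, b) => a.at.localeCompare(b.at))
    .map((event) => event.type)
    .join(" -> ");
}

// Usage: events recorded out of order still render chronologically.
const events: SessionEvent[] = [
  { type: "snapshot_created", at: "2025-01-01T10:10:00Z", note: "Initial implementation verified" },
  { type: "session_started", at: "2025-01-01T10:00:00Z" },
  { type: "replay_generated", at: "2025-01-01T10:15:00Z" },
  { type: "verification_passed", at: "2025-01-01T10:05:00Z" },
];
console.log(renderTimeline(events));
// session_started -> verification_passed -> snapshot_created -> replay_generated
```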
Runtime session data is ignored by git by default in this repo:
- .agentflight/sessions/
- .agentflight/reports/
- .agentflight/evidence/
- .agentflight/current/
.agentflight/config.json is intentionally not ignored, so a project can commit its local AgentFlight defaults when useful.
AgentFlight always excludes its own runtime session/report/current/evidence files from changed-file analysis. Additional generated or internal files can be ignored locally:
{
"changedFileFilters": {
"ignore": [".projscan-memory/**"]
}
}
See docs/development/changed-file-filters.md.
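To show how an ignore list like the one above might be applied, here is a minimal TypeScript sketch. It assumes a small glob subset ("**" spans directories, "*" stays within one path segment); the globToRegExp and filterChangedFiles helpers are hypothetical, and AgentFlight's actual matching rules are the ones described in the linked document.

```typescript
// Convert a small glob subset to a RegExp (an assumption, not the real matcher).
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters, leaving "*" for the glob handling below.
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // placeholder so the next step skips "**"
    .replace(/\*/g, "[^/]*") // "*" matches within a single path segment
    .replace(/\u0000/g, ".*"); // "**" matches across directories
  return new RegExp(`^${pattern}$`);
}

// Drop changed files that match any ignore pattern.
function filterChangedFiles(files: string[], ignore: string[]): string[] {
  const matchers = ignore.map(globToRegExp);
  return files.filter((file) => !matchers.some((matcher) => matcher.test(file)));
}

// Usage with the ignore pattern from the config example.
const remaining = filterChangedFiles(
  ["src/auth/reset.ts", ".projscan-memory/index.json", ".projscan-memory/cache/a.json"],
  [".projscan-memory/**"]
);
console.log(remaining); // [ 'src/auth/reset.ts' ]
```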
Commands
- agentflight init: initializes .agentflight/ with safe writes.
- agentflight start --task "...": starts a session and writes the current handoff.
- agentflight status: summarizes changed files, risk, verification status, review focus, proof gaps, readiness, snapshots, and next action.
- agentflight verify -- <command>: runs a proof command, records stdout/stderr evidence, and prints a small heartbeat while long commands are still active.
- agentflight verify: runs commands from .agentflight/config.json.
- agentflight snapshot --note "...": records current git, risk, and verification state as a timeline event.
- agentflight report: generates a Markdown proof report with review focus and readiness.
- agentflight replay: generates a local self-contained HTML replay with review focus and proof gaps.
- agentflight resume: prints and saves a continuation prompt with the next safest action.
- agentflight doctor: checks local setup, scripts, tools, config, and current session state.
Future placeholders exist for upgrade, license, and login; AgentFlight Pro/Team is not available yet.
Powered By ProjScan And AgentLoopKit
AgentFlight is powered by two open engines from Baseframe Labs:
- ProjScan provides repo intelligence, risk analysis, codebase understanding, and preflight signals.
- AgentLoopKit provides task discipline, verification evidence, policies, and handoffs.
This repository dogfoods both tools. See docs/development/dogfooding.md.
Strategic architecture:
- ProjScan: repo intelligence engine
- AgentLoopKit: agent workflow discipline engine
- AgentFlight: commercial and user-facing experience layer
Example Session
Read docs/examples/basic-agentflight-session.md for a short password-reset walkthrough with status, report, replay, and resume artifacts.
Roadmap
Not built yet:
- cloud sync
- login
- billing
- GitHub App
- Team dashboards
- paid feature gates
Releases
AgentFlight uses npm Trusted Publishing from GitHub Actions for tagged releases. Pushes and pull requests run verification; npm publishes happen from v*.*.* tags.
See docs/development/release.md and CHANGELOG.md.
Contributing
Use the local verification loop before opening changes:
npm run verify
Keep changes scoped, local-first, and honest about proof. Do not claim tests passed unless they actually ran and passed.
License
Apache-2.0. See LICENSE.
