autonomous-agent-nightshift

v1.9.0

Published

21 days ago

Run Claude Code agents overnight on your codebase. Interactive terminal UI (Ink) + bash harness + Claude Code skill + slash commands. Write a todo file with Implementation+Validation pairs, launch, wake up to validated code.

noluyorabi

claude-code claude-code-skill autonomous-agent agent-loop overnight-agent nightshift ai-coding tdd bash cli

Autonomous Agent Nightshift

Distilled from production runs on multiple SaaS codebases (Next.js, Bun, Supabase, Stripe). Adapters documented for Python, Go, Rust.

How it works

Per-task pipeline:

Phase 1: Implement   Claude writes code + tests
Phase 2: Validate    prettier → tsc → eslint → test runner
  -> Fix loop        up to MAX_FIX_ATTEMPTS (default 7)
Phase 3: Chrome      optional live browser test via Claude-in-Chrome MCP
  -> Fix loop        up to MAX_CHROME_FIX_ATTEMPTS (default 3)
Phase 4: Mark done   checkbox [x] in todo file

Failed tasks get [x] -- NEEDS MANUAL REVIEW and the loop moves on. You triage in the morning.
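
The control flow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual runner: `validate_with_fixes`, `request_fix`, `run_task`, and the `BROKEN` counter are hypothetical stand-ins, and only `MAX_FIX_ATTEMPTS` with its default of 7 comes from the description above. The optional Chrome phase is left out.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the validate/fix loop; not the bundled run-agent-loop.sh.
MAX_FIX_ATTEMPTS="${MAX_FIX_ATTEMPTS:-7}"

# Stand-ins: BROKEN counts remaining failures. A real runner would call the
# validation tools and ask the agent for a fix instead.
run_full_validation() { [ "${BROKEN:-0}" -eq 0 ]; }
request_fix() { BROKEN=$((BROKEN - 1)); }   # pretend each fix removes one failure

# Re-run validation until it passes or the fix budget is used up.
validate_with_fixes() {
    local attempt=0
    until run_full_validation; do
        if [ "$attempt" -ge "$MAX_FIX_ATTEMPTS" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        request_fix "$attempt"
    done
    return 0
}

# Print the todo line for a task: plain [x] on success, flagged on failure.
run_task() {
    local name="$1"
    if validate_with_fixes; then
        echo "[x] $name"
    else
        echo "[x] $name -- NEEDS MANUAL REVIEW"
    fi
}
```

With `BROKEN=2` the task ends up as a plain `[x]`; with more failures than fix attempts it gets the NEEDS MANUAL REVIEW flag and the caller can move on to the next task.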

Install

Pick whichever channel matches your tooling. Full walkthrough: USAGE.md.

npm:

npm install -g autonomous-agent-nightshift
# or one-shot:
npx autonomous-agent-nightshift init my-feature

Homebrew:

brew install noluyorAbi/tap/nightshift
# or directly from repo (no tap yet):
brew install --HEAD https://raw.githubusercontent.com/noluyorAbi/autonomous-agent-nightshift/main/Formula/nightshift.rb

curl one-liner (skill + commands):

curl -fsSL https://raw.githubusercontent.com/noluyorAbi/autonomous-agent-nightshift/main/bin/install.sh | bash

Claude Code plugin:

/plugin marketplace add noluyorAbi/autonomous-agent-nightshift
/plugin install autonomous-agent-nightshift

Release tarball:

curl -fsSL https://github.com/noluyorAbi/autonomous-agent-nightshift/releases/latest/download/autonomous-agent-nightshift.tar.gz | tar xz

Git clone:

git clone https://github.com/noluyorAbi/autonomous-agent-nightshift

Note: For the plugin path, /plugin marketplace add only registers the source. You must then run /plugin install autonomous-agent-nightshift and fully quit and relaunch Claude Code. The same restart applies if you install as a skill (curl one-liner or git clone into ~/.claude/skills/).

The CLI tool, skill, and slash commands are orthogonal — you can use any combination. For full power, install via npm/brew (CLI) and the curl one-liner (skill+commands).

CLI

After npm install -g or brew install, the nightshift command is on your PATH:

nightshift init add-dark-mode       # bootstrap todo + runner in cwd
nightshift start                    # launch detached
nightshift tail                     # follow the summary log
nightshift ui                       # interactive TUI — watch live AND message the agent
nightshift status                   # alive? what task?
nightshift review                   # morning report
nightshift resume                   # diagnose + restart after a stop
nightshift bulletproof-init         # bootstrap a Bulletproof PR sweep
nightshift help                     # all commands

The CLI dispatches to the bundled bash scripts and templates. It resolves the install root regardless of channel (npm global, Homebrew Cellar, tarball, git clone, or ~/.claude/skills/).
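
One common way for a dispatcher to find its own install root is to follow symlinks back to the real script. The sketch below assumes the entry point lives in a `bin/` directory directly under the root and may be reached through a symlink, as package managers often create. `resolve_install_root` is a hypothetical name; the bundled CLI may do this differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the directory one level above the real script.
resolve_install_root() {
    local src="$1" dir
    # Follow the symlink chain until we reach a regular file.
    while [ -L "$src" ]; do
        dir="$(cd -P "$(dirname "$src")" && pwd)"
        src="$(readlink "$src")"
        # A relative link target is relative to the directory holding the link.
        case "$src" in
            /*) ;;
            *)  src="$dir/$src" ;;
        esac
    done
    # Assumption: the script sits in <root>/bin, so the root is its parent.
    (cd -P "$(dirname "$src")/.." && pwd)
}
```

A script would typically call it on itself, for example `resolve_install_root "${BASH_SOURCE[0]}"`, and then look up `scripts/` and `templates/` relative to the result.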

Slash commands

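After install, six commands trigger directly inside Claude Code:

/nightshift-setup
/nightshift-status
/nightshift-review
/nightshift-resume
/nightshift-debug
/nightshift-bulletproof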

Or talk naturally. The skill triggers on phrases like “set up a nightshift”, “review last night's run”, “is my agent still running”.

Cost and safety

Warning: Nightshift spends real money on Claude API calls and modifies your codebase autonomously. Read docs/07-cost-and-safety.md before launching your first run.

Before launch:

Commit any work you care about (the agent edits files)
Set MAX_ITERATIONS: it is your worst-case cost cap
Protect main via branch protection
Set spending limits in Anthropic Console
Review the diff in the morning before committing the agent's work
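
Part of this checklist can be checked mechanically before launch. The helper below is a hypothetical sketch, not the project's own preflight: it assumes `MAX_ITERATIONS` is exported in the environment and that the project is a git repository. Branch protection and spending limits still have to be checked by hand.

```shell
#!/usr/bin/env bash
# Hypothetical pre-launch check covering two items from the list above.
preflight_check() {
    local repo_dir="${1:-.}"

    # Reject an empty, zero, or non-numeric iteration cap.
    case "${MAX_ITERATIONS:-}" in
        ''|0|*[!0-9]*)
            echo "MAX_ITERATIONS must be a positive integer"
            return 1
            ;;
    esac

    # Any output from --porcelain means uncommitted or untracked changes.
    # Note: a directory that is not a git repository is not detected here.
    if [ -n "$(git -C "$repo_dir" status --porcelain 2>/dev/null)" ]; then
        echo "Uncommitted changes found: commit or stash them first"
        return 1
    fi

    echo "preflight ok"
}
```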

Two modes

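Classic mode (scripts/run-agent-loop.sh) is the feature-implementation runner. Bulletproof mode (scripts/nightshift-bulletproof.sh) is the branch + commit-per-step + PR variant. Both share the validation pipeline: prettier → tsc → eslint → tests → (optional) Chrome browser test.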
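
A gate of this kind runs its steps in order and stops at the first failure, so a later tool never runs against code that an earlier one already rejected. A minimal sketch, assuming each step is a shell command string (`run_pipeline` is a hypothetical name, not a function from the bundled scripts):

```shell
#!/usr/bin/env bash
# Hypothetical ordered gate: run each command, stop at the first failure.
run_pipeline() {
    local step
    for step in "$@"; do
        echo "==> $step"
        if ! bash -c "$step"; then
            echo "FAILED: $step"
            return 1
        fi
    done
}

# Possible invocation for a TypeScript project (assumes these tools are installed):
#   run_pipeline "npx prettier --check ." "npx tsc --noEmit" "npx eslint ." "npm test"
```

The name of the failing step is printed, which is the piece of output a fix loop would most plausibly hand back to the agent.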

Quickstart (with the skill)

After install:

Set up a nightshift to add a stop button to my chat UI, plus branch navigation, plus a media gallery page.

Claude reads your package.json, detects your stack, writes todo-YYYY_MM_DD_chat-features.md with three properly-spec'd tasks, generates the codebase context by exploring src/, copies and customizes run-agent-loop.sh, runs preflight, and tells you the exact launch command.

You run:

./start-nightshift.sh start

Next morning:

Show me what happened overnight.

Claude reads .agent-logs/nightshift-summary.log, greps the todo for REVIEW flags, summarizes the diff by area, shows screenshots for any CHROME REVIEW NEEDED tasks, and produces a structured report with suggested commits.
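
The flag search can also be done by hand. The sketch below assumes the two flag strings appear verbatim in the todo file, as quoted above; `list_flagged_tasks` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every flagged line of a todo file with its line number.
list_flagged_tasks() {
    local todo_file="$1"
    # grep exits non-zero when nothing matches; treat "no flags" as success.
    grep -n -E 'NEEDS MANUAL REVIEW|CHROME REVIEW NEEDED' "$todo_file" || true
}
```

For example, `list_flagged_tasks todo-YYYY_MM_DD_chat-features.md` (with the real date filled in) lists the tasks to triage first.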

Tuning

Formula: MAX_ITERATIONS >= NUM_TASKS × (1 + MAX_FIX_ATTEMPTS + 2)
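
As a worked example of the formula, here is a small calculator. `min_iterations` is a hypothetical helper; the default of 7 fix attempts is the one given in the pipeline description.

```shell
#!/usr/bin/env bash
# Lower bound for MAX_ITERATIONS: NUM_TASKS x (1 + MAX_FIX_ATTEMPTS + 2).
min_iterations() {
    local num_tasks="$1"
    local max_fix_attempts="${2:-7}"
    echo $(( num_tasks * (1 + max_fix_attempts + 2) ))
}

min_iterations 5      # 5 tasks, 7 fix attempts: prints 50
min_iterations 3 2    # 3 tasks, 2 fix attempts: prints 15
```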

Full presets in templates/runner-config.env.

vs alternatives

Niche: you trust the model to write code, want it harnessed by a strict validation gate, and want it detached so you can sleep.

Adapting to other stacks

The validation function is a single bash function. Replace it.

# Python: format, type-check, lint, test
run_full_validation() {
    black . && mypy . && ruff check . && pytest
}

# Go: format, vet, lint, test
run_full_validation() {
    gofmt -w . && go vet ./... && golangci-lint run && go test ./...
}

# Rust: format, lint (warnings are errors), test
run_full_validation() {
    cargo fmt && cargo clippy -- -D warnings && cargo test
}

Per-stack codebase-context examples and conventions are in docs/01-playbook.md, sections 11 and 13.
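
To decide which adapter applies, one option is to look for each stack's manifest file. This is a hypothetical sketch (`detect_stack` is not part of the bundled scripts) and it assumes one stack per repository; the first match wins.

```shell
#!/usr/bin/env bash
# Hypothetical helper: guess the stack from well-known manifest files.
detect_stack() {
    local dir="${1:-.}"
    if   [ -f "$dir/package.json" ];   then echo node
    elif [ -f "$dir/pyproject.toml" ]; then echo python
    elif [ -f "$dir/go.mod" ];         then echo go
    elif [ -f "$dir/Cargo.toml" ];     then echo rust
    else echo unknown
    fi
}
```

A runner could use the result to pick one of the `run_full_validation` variants above and fall back to asking the user when it prints `unknown`.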

Repo layout

SKILL.md                      Claude Code skill manifest + workflow guide
.claude-plugin/plugin.json    Plugin marketplace manifest
bin/install.sh                One-liner installer

commands/
  nightshift-setup.md         /nightshift-setup
  nightshift-status.md        /nightshift-status
  nightshift-review.md        /nightshift-review
  nightshift-resume.md        /nightshift-resume
  nightshift-debug.md         /nightshift-debug
  nightshift-bulletproof.md   /nightshift-bulletproof

docs/
  01-playbook.md              Master guide (47 KB)
  02-bulletproof-mode.md      PR-loop variant deep dive
  03-chrome-testing.md        Live browser MCP testing
  04-qa-checklist.md          Writing production-ship checklists
  05-failure-modes.md         Cheatsheet of seen failures + fixes
  06-test-loop.md             Recursive validation loop (no Claude)
  07-cost-and-safety.md       Cost estimates + safety checklist
  FAQ.md                      Common questions

scripts/
  run-agent-loop.sh           Classic feature-implementation runner
  nightshift-bulletproof.sh   Branch + commit per step + PR variant
  start-nightshift.sh         start/stop/status/tail wrapper
  test-nightshift.sh          Recursive validation loop

templates/
  todo-template.md            Feature-plan skeleton
  codebase-context.md         Heredoc filler for the runner
  qa-checklist-template.md    Production-ship checklist skeleton
  runner-config.env           Tuning presets per scenario

examples/
  todo-simple-example.md      Synthetic 5-task dark-mode toggle (start here)
  bulletproof-steps-example.md Synthetic 10-step hardening sweep
  qa-checklist-saas.md        Real (sanitized) 22-section SaaS checklist
  bulletproof-summary.log     Real (sanitized) 100-step run timeline

.github/workflows/lint.yml    CI: shellcheck + markdownlint + JSON validation
CONTRIBUTING.md               How to contribute
CHANGELOG.md                  Release history
SECURITY.md                   Vulnerability disclosure policy

Prerequisites

Why nightshift

The autonomous loop verifies that the code compiles, lints cleanly, type-checks, passes its unit tests, and that the browser sees the expected element. It does not verify the user's golden path end-to-end, payment flows, email sends, cross-feature interactions, performance, cross-browser behavior, mobile UX, legal copy, or compliance.

That's the human's job in the morning. The QA checklist (templates/qa-checklist-template.md) is the contract for those things.

Nightshift is not "AI replaces engineering." It is "AI handles the mechanical 80% so you can spend morning hours on the judgment-heavy 20%."

License

MIT. See LICENSE.

Contributing

See CONTRIBUTING.md. The most valuable contributions: new stack adapters (Elixir, Kotlin, Swift, Ruby, .NET), sanitized real examples from your own runs, failure modes not yet in docs/05-failure-modes.md.

Security

See SECURITY.md for vulnerability disclosure.

Built for the overnight engineer.