its-magic
v0.1.2-18
Published
its-magic - AI dev team workflow for Cursor.
its-magic — AI dev team
Happy coding! Build something awesome.
Drop-in template repo that implements a structured its-magic workflow in Cursor: intake -> discovery -> architecture -> sprint plan -> execute -> QA -> release, with pause/resume, decision gates, and persistent artifacts.
Features (what its-magic can do)
- Structured phase workflow with explicit artifacts.
- Artifact-first memory (state in files, not chat only).
- Decision gate + escalation (decisions/DEC-xxxx.md).
- Pause/resume with checkpoints (handoffs/resume_brief.md).
- Automated execute/QA loop with safety caps (optional).
- 3-layer quality chain: AI loop → local validate-and-push → CI auto-fix.
- CI/CD templates driven by docs/engineering/runbook.md.
- Team-friendly local overrides (scratchpad.local.md).
- Optional remote/docker execution and autonomous installs.
- Built-in benchmarks (live, prompted, headless).
- Multiplatform distribution (npm, Chocolatey, Homebrew).
Setup
its-magic is an installer you run once per repo. It copies the AI dev team
workflow files (.cursor/ commands, rules, agents, hooks, skills, plus docs/,
sprints/, handoffs/, etc.) into your project.
Starter artifacts are shipped as clean placeholders (no preloaded sprint/demo
history), so /intake starts from your own idea.
1) Install its-magic (once)
Pick one method:
| Method | Install command |
|--------|----------------|
| npm | npm install -g its-magic |
| npx | npx its-magic --target . --mode missing |
| Chocolatey | choco install its-magic (Admin shell) |
| Homebrew | brew tap USER/tap && brew install its-magic |
2) Apply to a repo
New repo:
mkdir my-project && cd my-project
git init
its-magic --target . --mode missing --create
Existing repo (safe merge):
its-magic --target . --mode missing
Existing repo (overwrite + backup):
its-magic --target . --mode overwrite --backup
Upgrading an existing repo
When you update its-magic to a newer version (npm update -g its-magic), run
upgrade mode to update framework files while preserving your project data:
its-magic --target . --mode upgrade
What upgrade does:
- Framework files (commands, rules, agents, hooks, skills, CI, scripts) are updated to the latest version.
- User data (docs, sprints, handoffs, decisions, runbook) is never touched.
- Mixed files (.cursor/scratchpad.md, README.md) are preserved. If the template version has new content, a review notice is printed.
- A .its-magic-version file tracks the installed version in your repo.
Upgrade with backup (backs up framework files before updating):
its-magic --target . --mode upgrade --backup
3) Open in Cursor
- Open the project folder
- Run /intake with your idea
- Follow the workflow
CLI quick commands
# Show banner + help
its-magic
# Show version only
its-magic --version
# Install workflow files into current repo
its-magic --target . --mode missing
# Clean previously installed workflow artifacts
its-magic --clean-repo --target .
Installer options
Install options
| Flag | Description |
|------|-------------|
| --target <path> | Path to the repository where workflow files are installed. If omitted you are prompted interactively. |
| --mode missing | Default. Only copy files that do not exist yet. Safe for repos that already have some workflow files. |
| --mode overwrite | Replace every file, even if it already exists. Combine with --backup to keep a snapshot first. |
| --mode interactive | Ask per file whether to overwrite or skip. Useful when you want to cherry-pick updates. |
| --mode upgrade | Update framework files (commands, rules, agents, hooks, skills, CI, scripts) while preserving user data (docs, sprints, handoffs, decisions). Use after updating its-magic to a newer version. |
| --backup | Before overwriting, save existing files to backups/<timestamp>/. Ignored in missing mode (nothing gets replaced). |
| --create | Create the target directory if it does not exist. |
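The mode and --backup rules in the tables above can be read as a small per-file decision for files that already exist in the target. The sketch below only illustrates that reading and is not the installer's actual code: the function name `plan_existing_file`, the returned action strings, and the path groupings are assumptions, with the groupings loosely derived from the framework / user data / mixed split described for upgrade mode.

```python
# Hypothetical path groups, loosely based on the lists in this README.
FRAMEWORK_PREFIXES = (
    ".cursor/commands/", ".cursor/rules/", ".cursor/agents/",
    ".cursor/hooks/", ".cursor/skills/", ".github/workflows/", "scripts/",
)
MIXED_FILES = {".cursor/scratchpad.md", "README.md"}


def plan_existing_file(rel_path: str, mode: str, backup: bool = False) -> str:
    """Decide what to do with a file that already exists in the target repo.

    Returns one of: 'skip', 'ask', 'overwrite', 'backup+overwrite', 'review'.
    """
    replace_action = "backup+overwrite" if backup else "overwrite"
    if mode == "missing":
        return "skip"                 # nothing is replaced, so --backup is ignored
    if mode == "overwrite":
        return replace_action
    if mode == "interactive":
        return "ask"                  # per-file prompt
    if mode == "upgrade":
        if rel_path in MIXED_FILES:
            return "review"           # preserved; a review notice may be printed
        if rel_path.startswith(FRAMEWORK_PREFIXES):
            return replace_action     # framework files are updated
        return "skip"                 # user data is left alone
    raise ValueError(f"unknown mode: {mode!r}")


print(plan_existing_file(".cursor/rules/core.mdc", "upgrade", backup=True))
print(plan_existing_file("docs/product/backlog.md", "upgrade"))
```

Files that do not exist yet are outside this sketch; how each mode treats them is not spelled out beyond the table.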
Clean options
| Flag | Description |
|------|-------------|
| --clean-repo | Remove all its-magic workflow artifacts from the target repo (.cursor, docs/product, docs/engineering, sprints, handoffs, decisions). Your own source code is never touched. |
| --yes | Skip the confirmation prompt when cleaning. |
Info
| Flag | Description |
|------|-------------|
| --help, -h | Show banner, version, repo URL, and full usage reference. |
| --version, -v | Print the installed its-magic version and exit. |
How-to
Command usage pattern
- Best practice: use /<command> + 1-3 lines context.
- For quick ops (/pause, /resume, /refresh-context) command-only is fine.
What gets installed
your-project/
.cursor/commands/ Cursor slash commands
.cursor/rules/ AI behavior rules
.cursor/agents/ Subagent definitions
.cursor/skills/ Reusable skills
.cursor/hooks/ Automation hooks
.cursor/scratchpad.md Shared configuration flags
.cursor/scratchpad.local.example.md
docs/ Engineering & product docs, runbook
sprints/ Sprint tracking artifacts
handoffs/ Phase handoff artifacts
decisions/ Decision records
scripts/validate-and-push.ps1 Local test-fix-push loop (Windows)
scripts/validate-and-push.sh Local test-fix-push loop (Linux/Mac)
.github/workflows/ CI with auto-fix loop
README.md
Team mode local overrides (recommended)
Use two layers:
- Shared defaults: .cursor/scratchpad.md (committed)
- Personal overrides: .cursor/scratchpad.local.md (gitignored)
Setup:
- Copy .cursor/scratchpad.local.example.md to .cursor/scratchpad.local.md
- Set personal values there (TEAM_MEMBER, ACTIVE_TASK_IDS, automation style)
- Hook merges shared + local (local wins)
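The "local wins" merge can be pictured as a dictionary update. This is a minimal sketch, assuming the scratchpad files carry their flags as plain `KEY=VALUE` lines; the function names `parse_flags` and `merge_scratchpads` are hypothetical, and the real hook may parse the files differently.

```python
from typing import Dict


def parse_flags(text: str) -> Dict[str, str]:
    """Collect KEY=VALUE lines; blank lines, '#' lines and prose are skipped."""
    flags: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        flags[key.strip()] = value.strip()
    return flags


def merge_scratchpads(shared_text: str, local_text: str = "") -> Dict[str, str]:
    """Start from the shared defaults, then let local values override them."""
    merged = parse_flags(shared_text)
    merged.update(parse_flags(local_text))
    return merged


shared = "TEAM_MODE=1\nPERMISSION_MODE=interactive\n"
local = "TEAM_MEMBER=alice\nPERMISSION_MODE=auto\n"
print(merge_scratchpads(shared, local))
```

Here the personal `PERMISSION_MODE=auto` replaces the shared value, while `TEAM_MODE` is inherited unchanged.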
Workflow
Core commands
- /ask: ask questions using project context (read-only, no artifacts created).
- /intake: capture idea, backlog, acceptance.
- /discovery: collect UX/product references.
- /research: risks, patterns, dependencies.
- /architecture: technical approach and decisions.
- /sprint-plan: sprint and task list.
- /plan-verify: acceptance coverage check.
- /execute: implement tasks.
- /qa: test and report findings.
- /verify-work: UAT.
- /release: release notes + runbook updates.
- /memory-audit: read-only memory drift check with advisory report.
- /pause, /resume, /refresh-context.
- /auto: orchestration mode that spawns a fresh subagent per phase.
Release notes model (US-0040)
Release history is sprint-scoped and queue-backed:
- Canonical sprint notes: handoffs/releases/Sxxxx-release-notes.md
- Canonical queue tracker: handoffs/release_queue.md
- Legacy compatibility pointer: handoffs/release_notes.md
Deterministic release semantics:
- Only target sprint artifacts/queue row may be mutated during one /release run.
- Entering release flow sets target row to unreleased.
- Successful finalization transitions same row to released.
- Unresolved sprint identity or queue/notes mismatch fails closed with reason codes and remediation guidance; no destructive reconciliation by default.
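The release semantics above amount to a two-step state change on a single queue row. The sketch below models the queue as a dict of sprint id to status, which is an assumption about handoffs/release_queue.md rather than its real format; the names `enter_release`, `finalize_release`, `ReleaseError`, the reason-code strings, and the placeholder status `queued` are all hypothetical.

```python
from typing import Dict, Optional


class ReleaseError(Exception):
    """Fail-closed error; the reason-code strings used below are hypothetical."""


def enter_release(queue: Dict[str, str], sprint_id: Optional[str]) -> Dict[str, str]:
    """Set only the target sprint row to 'unreleased'."""
    if not sprint_id or sprint_id not in queue:
        raise ReleaseError("SPRINT_UNRESOLVED: no matching queue row; fix the sprint id and retry")
    updated = dict(queue)            # copy, so a failure never leaves a half-edited queue
    updated[sprint_id] = "unreleased"
    return updated


def finalize_release(queue: Dict[str, str], sprint_id: str, notes_exist: bool) -> Dict[str, str]:
    """Move the same row from 'unreleased' to 'released', or fail closed."""
    if queue.get(sprint_id) != "unreleased":
        raise ReleaseError("QUEUE_STATE_MISMATCH: row is not 'unreleased'; re-enter the release flow")
    if not notes_exist:
        raise ReleaseError("NOTES_MISSING: sprint release notes not found; write them and retry")
    updated = dict(queue)
    updated[sprint_id] = "released"
    return updated


queue = {"S0001": "released", "S0002": "queued"}   # 'queued' is a placeholder status
in_flight = enter_release(queue, "S0002")
print(finalize_release(in_flight, "S0002", notes_exist=True))
```

Both functions return a new dict and leave every other row untouched, mirroring the "only the target row may be mutated" rule.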
Agent isolation model
- Every phase command runs in a fresh agent/subagent context.
- Handoff files are the only cross-phase memory (handoffs/*.md + artifact files).
- Never rely on "ignore prior chat"; use a new context boundary instead.
- /auto is orchestration only: it calls phase subagents and transfers context through artifacts.
Lightweight interaction
Use /ask when you want to query the project without triggering the workflow:
- "What's the current sprint status?"
- "Which stories are still open?"
- "How does the upgrade mode work?"
- "What decision was made about X?"
/ask reads the project artifacts (state, backlog, architecture, decisions, sprint
progress) and answers from them. It never creates or modifies files. If your question
reveals a bug or feature idea, it will suggest running /intake.
Memory drift auditing
Use /memory-audit to check whether project memory artifacts still match
repository reality. This is a read-only, non-blocking command that produces an
advisory report at docs/engineering/memory-drift-report.md.
When to run:
- Pre-handoff: before writing any role handoff artifact.
- Pre-QA: before /qa or /verify-work.
- Pre-release: before /release.
- Ad-hoc: after external code changes, long pauses, or whenever artifacts feel stale.
How to interpret output:
The report contains a severity summary (high / medium / low) and a
findings table with concrete evidence for each inconsistency. High-severity
findings should be resolved before the next handoff or release. Medium and low
findings can be addressed during /refresh-context or the next sprint.
The report also includes a reference-only "Template drift" section. Template
drift remediation belongs to US-0017 — /memory-audit only flags it for
awareness.
Follow-up commands:
- /refresh-context: update stale artifacts.
- /sprint-plan: if new work is discovered.
- /verify-work: if acceptance status needs re-validation.
- /intake: if findings reveal a new story or bug.
Workflow diagrams
flowchart TD
Intake[/intake/] --> Discovery[/discovery/]
Discovery --> Research[/research/]
Research --> Architecture[/architecture/]
Architecture --> SprintPlan[/sprint-plan/]
SprintPlan --> PlanVerify[/plan-verify/]
PlanVerify --> Execute[/execute/]
Execute --> QA[/qa/]
QA -->|fixes needed| Execute
QA --> VerifyWork[/verify-work/]
VerifyWork --> Release[/release/]
Release --> Refresh[/refresh-context/]
Execute --> Pause[/pause/]
QA --> Pause
Release --> Pause
Pause --> Resume[/resume/]
Resume --> Execute
flowchart TD
Start[Idea] --> Intake2[/intake/]
Intake2 -->|DecisionGate| Decision{DecisionRequired}
Decision -->|ChooseOption| Discovery2[/discovery/]
Decision -->|ChooseOption| Research2[/research/]
Decision -->|ChooseOption| Architecture2[/architecture/]
Decision -->|ChooseOption| SprintPlan2[/sprint-plan/]
Decision -->|ChooseOption| PlanVerify2[/plan-verify/]
Decision -->|ChooseOption| Execute2[/execute/]
Decision -->|ChooseOption| QA2[/qa/]
Decision -->|ChooseOption| VerifyWork2[/verify-work/]
Decision -->|ChooseOption| Release2[/release/]
Decision -->|ChooseOption| Refresh2[/refresh-context/]
Decision -->|ChooseOption| Pause2[/pause/]
Discovery2 --> Research2
Research2 --> Architecture2
Architecture2 --> SprintPlan2
SprintPlan2 --> PlanVerify2
PlanVerify2 --> Execute2
Execute2 --> QA2
QA2 -->|fixes needed| Execute2
QA2 --> VerifyWork2
VerifyWork2 --> Release2
Release2 --> Refresh2
Execute2 --> Pause2
QA2 --> Pause2
Release2 --> Pause2
Pause2 --> Resume2[/resume/]
Resume2 --> Execute2
Execute2 --> HandoffDevQA[handoffs/dev_to_qa.md]
QA2 --> HandoffQAD[handoffs/qa_to_dev.md]
Intake2 --> HandoffPOTL[handoffs/po_to_tl.md]
SprintPlan2 --> HandoffTLDev[handoffs/tl_to_dev.md]
Automation modes
Configure in .cursor/scratchpad.md:
- AUTO_FLOW_MODE=manual|auto_until_decision
  - manual: you trigger each phase/command yourself.
  - auto_until_decision: /auto continues by spawning fresh phase subagents until a decision gate, blocker, or pause boundary.
- PHASE_MODE=interactive|auto
  - interactive: agent asks clarifying questions more often.
  - auto: agent minimizes prompts and proceeds with best effort.
- PERMISSION_MODE=interactive|auto
  - interactive: ask before routine actions.
  - auto: reduce routine permission prompts.
- RUN_TESTS_ON_EDIT=0|1
  - 1: runs configured tests after meaningful edits.
  - 0: tests only when you explicitly run QA/test phases.
- LOOP_UNTIL_GREEN=0|1
  - 1: keep iterating fix -> test until green (bounded).
  - 0: run one pass and report failures.
- AUTO_IMPLEMENTATION_LOOP=0|1
  - 1: enables execute -> QA -> execute loop automatically with new Dev/QA subagent instances on each cycle.
- AUTO_LOOP_MAX_CYCLES=<n>
  - safety cap for auto loops (recommended 3-7, default 5).
- AUTO_PAUSE_REQUEST=0|1
  - 1: request graceful stop at next safe boundary.
- AUTO_PAUSE_POLICY=after_task|after_phase
  - after_task: faster stop, more frequent boundaries.
  - after_phase: cleaner checkpoints, fewer interruptions.
Sync policy (US-0038)
Phase-triggered sync is policy-controlled and safe by default.
Scratchpad controls:
- SYNC_POLICY_MODE=disabled|manual|by_phase|by_milestone|custom_phase_list
- SYNC_CUSTOM_PHASES=<comma-separated canonical phases>
- ALLOW_AUTO_PUSH=0|1
- AUTO_PUSH_BRANCH_ALLOWLIST=<comma-separated branches/patterns>
Default-safe behavior:
- Default mode is manual with ALLOW_AUTO_PUSH=0 (no automatic push).
- disabled and manual add near-zero overhead and preserve manual workflows.
- Sync policy is evaluated only at completed phase boundaries.
Guarded auto-push conditions (all must pass):
- Boundary matches configured mode.
- Auto-push is explicitly enabled (ALLOW_AUTO_PUSH=1).
- QA-first safety holds (feature work cannot auto-push pre-QA).
- No unresolved blocking QA findings/critical issues.
- Branch safety holds (protected/default branches denied unless allowlisted).
- Check chain passes (TEST_COMMAND required; optional lint/typecheck only if configured).
Deterministic reason codes include:
SYNC_DISABLED, MANUAL_MODE_NO_AUTO, PRE_QA_AUTOPUSH_FORBIDDEN,
BLOCKING_QA_FINDINGS, BRANCH_NOT_ALLOWLISTED, TEST_COMMAND_MISSING,
TEST_FAILED, TEST_TIMEOUT, OPTIONAL_CHECK_FAILED, SYNC_PUSHED.
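The guarded auto-push conditions can be read as an ordered chain of checks that stops at the first failure and returns a reason code. The following is a sketch of that reading only: the `SyncContext` fields, the check order, and the two codes marked hypothetical are assumptions, and test results are passed in pre-computed rather than produced by running commands.

```python
from dataclasses import dataclass, field, replace
from fnmatch import fnmatch
from typing import List, Optional


@dataclass
class SyncContext:
    mode: str                      # SYNC_POLICY_MODE value
    boundary_matches: bool         # completed phase matches the configured mode
    allow_auto_push: bool          # ALLOW_AUTO_PUSH=1
    qa_done: bool                  # QA has already run for this feature work
    blocking_findings: int         # unresolved blocking QA findings
    branch: str
    branch_is_protected: bool      # protected or default branch
    allowlist: List[str] = field(default_factory=list)  # AUTO_PUSH_BRANCH_ALLOWLIST
    test_command: Optional[str] = None                  # TEST_COMMAND from the runbook
    test_result: str = "passed"    # "passed" | "failed" | "timeout"
    optional_checks_ok: bool = True


def evaluate_sync(ctx: SyncContext) -> str:
    """Return the first reason code that applies; SYNC_PUSHED means all guards passed."""
    if ctx.mode == "disabled":
        return "SYNC_DISABLED"
    if ctx.mode == "manual":
        return "MANUAL_MODE_NO_AUTO"
    if not ctx.boundary_matches:
        return "BOUNDARY_NOT_MATCHED"      # hypothetical code, not in the documented list
    if not ctx.allow_auto_push:
        return "AUTO_PUSH_NOT_ENABLED"     # hypothetical code, not in the documented list
    if not ctx.qa_done:
        return "PRE_QA_AUTOPUSH_FORBIDDEN"
    if ctx.blocking_findings > 0:
        return "BLOCKING_QA_FINDINGS"
    allowlisted = any(fnmatch(ctx.branch, pattern) for pattern in ctx.allowlist)
    if ctx.branch_is_protected and not allowlisted:
        return "BRANCH_NOT_ALLOWLISTED"
    if not ctx.test_command:
        return "TEST_COMMAND_MISSING"
    if ctx.test_result == "timeout":
        return "TEST_TIMEOUT"
    if ctx.test_result != "passed":
        return "TEST_FAILED"
    if not ctx.optional_checks_ok:
        return "OPTIONAL_CHECK_FAILED"
    return "SYNC_PUSHED"


base = SyncContext(mode="by_phase", boundary_matches=True, allow_auto_push=True,
                   qa_done=True, blocking_findings=0, branch="feature/login",
                   branch_is_protected=False, test_command="npm test")
print(evaluate_sync(base))                                                    # SYNC_PUSHED
print(evaluate_sync(replace(base, branch="main", branch_is_protected=True)))  # BRANCH_NOT_ALLOWLISTED
```

Returning a single code per evaluation keeps the outcome deterministic: the same inputs always yield the same reason.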
Full scratchpad reference (detailed)
- MAGIC_CONTEXT_STRICT=0|1
  - 1: enforces context refresh discipline after code edits.
- DONE=0|1
  - 1: stop hook reminder loops when session is complete.
- MAGIC_BENCH_SESSION=<id>
  - enables live benchmark event logging under one session id.
- AUTO_INSTALL_DEPS=0|1
  - 1: agent may install dependencies/runtimes automatically.
- AUTO_RELEASE_NOTES=0|1
  - 1: auto-generate handoffs/release_notes.md.
- REMOTE_EXECUTION=0|1
  - 1: allow remote/docker execution if configured.
- REMOTE_CONFIG=.cursor/remote.json
  - path to remote execution server config.
Remote execution config (.cursor/remote.json)
Remote config is optional and mode-aware:
- REMOTE_EXECUTION=0 (default): skip remote config checks entirely.
- REMOTE_EXECUTION=1: validate .cursor/remote.json first and fail fast on missing/malformed/invalid or insecure config.
Canonical contract (DEC-0016):
- Required root fields:
  - version (integer)
  - defaultTarget (string)
  - targets (array)
- Required target fields:
  - id (string)
  - type (docker|ssh|vm)
  - enabled (boolean)
  - host (string)
  - port (integer 1..65535)
  - workspaceRoot (string)
- Optional:
  - auth.mode (none|env)
  - If auth.mode=env, use env-var references only (tokenEnv, passwordEnv, privateKeyPathEnv, ...).
Two safe target examples are shipped in:
- .cursor/remote.json (active repo)
- template/.cursor/remote.json (template parity copy)
The examples include:
- local-docker: local network/docker-like endpoint.
- remote-vm-ssh: remote VM/SSH-like endpoint.
No secrets policy:
- Never commit inline tokens/passwords/private keys in remote.json.
- Commit env-var reference names only.
Fail-fast error format:
[REMOTE_CONFIG_ERROR] <path>: expected <rule>, got <actual>. Fix: <hint>.
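To make the contract and the error format concrete, here is a minimal validator sketch over an already-parsed config. It is not the shipped validator: the function name `validate_remote_config` and the hint texts are hypothetical, `<path>` is assumed to mean the JSON field path, and the sample host, port, and workspace values are made up. Secret detection and checking that `defaultTarget` names an existing target are left out.

```python
from typing import Any, List

TARGET_TYPES = ("docker", "ssh", "vm")
AUTH_MODES = ("none", "env")


def _err(path: str, rule: str, actual: Any, hint: str) -> str:
    return f"[REMOTE_CONFIG_ERROR] {path}: expected {rule}, got {actual!r}. Fix: {hint}."


def _is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so rule it out explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_remote_config(cfg: Any) -> List[str]:
    """Return a list of error strings; an empty list means the config passed."""
    if not isinstance(cfg, dict):
        return [_err("$", "a JSON object", cfg, "wrap the settings in an object")]
    errors: List[str] = []
    if not _is_int(cfg.get("version")):
        errors.append(_err("version", "integer", cfg.get("version"), "set an integer version"))
    if not isinstance(cfg.get("defaultTarget"), str):
        errors.append(_err("defaultTarget", "string", cfg.get("defaultTarget"), "name one target id"))
    targets = cfg.get("targets")
    if not isinstance(targets, list):
        errors.append(_err("targets", "array", targets, "provide a list of targets"))
        return errors
    for i, target in enumerate(targets):
        base = f"targets[{i}]"
        if not isinstance(target, dict):
            errors.append(_err(base, "object", target, "describe the target as an object"))
            continue
        for key in ("id", "host", "workspaceRoot"):
            if not isinstance(target.get(key), str):
                errors.append(_err(f"{base}.{key}", "string", target.get(key), f"set {key} to a string"))
        if target.get("type") not in TARGET_TYPES:
            errors.append(_err(f"{base}.type", "docker|ssh|vm", target.get("type"), "pick an allowed type"))
        if not isinstance(target.get("enabled"), bool):
            errors.append(_err(f"{base}.enabled", "boolean", target.get("enabled"), "use true or false"))
        port = target.get("port")
        if not (_is_int(port) and 1 <= port <= 65535):
            errors.append(_err(f"{base}.port", "integer 1..65535", port, "choose a valid port"))
        auth = target.get("auth")
        if isinstance(auth, dict) and auth.get("mode", "none") not in AUTH_MODES:
            errors.append(_err(f"{base}.auth.mode", "none|env", auth.get("mode"), "use none or env"))
    return errors


sample = {
    "version": 1,
    "defaultTarget": "local-docker",
    "targets": [{"id": "local-docker", "type": "docker", "enabled": True,
                 "host": "127.0.0.1", "port": 70000, "workspaceRoot": "/workspace"}],
}
for message in validate_remote_config(sample):
    print(message)
```

The sketch collects every violation instead of stopping at the first one; a strict fail-fast variant would simply report the first entry and exit.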
Troubleshooting quick guide:
- Missing file in remote mode:
  - Create .cursor/remote.json from the template copy, or set REMOTE_EXECUTION=0.
- Invalid enum/type/range:
  - Update the failing field to match allowed values/ranges.
- Malformed JSON:
  - Fix JSON syntax and retry.
- Secret-like inline value detected:
  - Replace literal secret with an env-var reference field.
Team/local (recommended in .cursor/scratchpad.local.md):
- TEAM_MODE=0|1
- TEAM_MEMBER=<your-id>
- ACTIVE_TASK_IDS=T-12,T-13
Automated feature loop (optional)
Enable:
- AUTO_FLOW_MODE=auto_until_decision
- PHASE_MODE=auto
- PERMISSION_MODE=auto
- RUN_TESTS_ON_EDIT=1
- LOOP_UNTIL_GREEN=1
- AUTO_IMPLEMENTATION_LOOP=1
- AUTO_LOOP_MAX_CYCLES=5
Then run /auto.
Graceful stop (for shutdown/end of day):
- Set AUTO_PAUSE_REQUEST=1
- Flow stops at next configured boundary (AUTO_PAUSE_POLICY)
- /pause artifacts are written
- Next day run /resume or /auto
Recommended profiles
Max automation (high autonomy):
- AUTO_FLOW_MODE=auto_until_decision
- PHASE_MODE=auto
- PERMISSION_MODE=auto
- RUN_TESTS_ON_EDIT=1
- LOOP_UNTIL_GREEN=1
- AUTO_IMPLEMENTATION_LOOP=1
- AUTO_LOOP_MAX_CYCLES=5
- AUTO_INSTALL_DEPS=1 (optional, if you trust auto installs)
- AUTO_PAUSE_POLICY=after_phase
Safer automation (recommended for most teams):
- same as above, but keep:
  - PERMISSION_MODE=interactive
  - AUTO_INSTALL_DEPS=0
  - AUTO_PAUSE_POLICY=after_task
Quality chain (3-layer auto-fix)
its-magic provides a complete quality chain that catches issues at three levels. Each layer catches problems the previous layer missed:
┌─────────────────────────────────────────────────────────────────┐
│ Layer 1: Cursor AI loop (in-editor) OFF by default │
│ AUTO_IMPLEMENTATION_LOOP + LOOP_UNTIL_GREEN │
│ execute → QA → fix → execute (bounded by AUTO_LOOP_MAX_CYCLES)│
└──────────────────────────┬──────────────────────────────────────┘
│ code ready to push
┌──────────────────────────▼──────────────────────────────────────┐
│ Layer 2: validate-and-push (local pre-push) MANUAL (run it)│
│ scripts/validate-and-push.sh / .ps1 │
│ test → format → lint-fix → test → commit + push │
└──────────────────────────┬──────────────────────────────────────┘
│ pushed to GitHub
┌──────────────────────────▼──────────────────────────────────────┐
│ Layer 3: CI auto-fix (GitHub Actions) OFF by default │
│ .github/workflows/ci.yml │
│ test/lint → auto-fix → commit → re-run (up to 3 retries) │
└─────────────────────────────────────────────────────────────────┘
| Layer | Default | Enable |
|-------|---------|--------|
| 1 - Cursor AI loop | off | Set AUTO_IMPLEMENTATION_LOOP=1 + LOOP_UNTIL_GREEN=1 in scratchpad |
| 2 - validate-and-push | manual | Run scripts/validate-and-push.sh or .ps1 before pushing |
| 3 - CI auto-fix | off | Set CI_AUTO_FIX: true in docs/engineering/runbook.md |
CI itself (tests, lint, typecheck) always runs on push/PR. Only the auto-fix
retry loop is gated behind CI_AUTO_FIX. When disabled, CI still reports
failures -- it just won't try to fix and commit automatically.
All commands are read from docs/engineering/runbook.md. Fill in your
project-specific commands once and every layer uses them:
TEST_COMMAND: npm test
LINT_COMMAND: npx eslint .
LINT_FIX_COMMAND: npx eslint --fix .
FORMAT_COMMAND: npx prettier --write .
CI_AUTO_FIX: true
Layer 1: Cursor AI loop
Enabled via scratchpad flags (see Automation modes).
The AI runs execute → QA → fix cycles inside Cursor until tests pass or
the safety cap (AUTO_LOOP_MAX_CYCLES) is reached.
Layer 2: Local validate-and-push
Run before pushing to catch anything the AI loop missed:
# Bash (Linux / macOS)
sh scripts/validate-and-push.sh
# PowerShell (Windows)
powershell scripts/validate-and-push.ps1
powershell scripts/validate-and-push.ps1 -MaxAttempts 3
The script:
- Runs FORMAT_COMMAND and LINT_FIX_COMMAND to auto-fix what it can
- Runs LINT_COMMAND and TEST_COMMAND to verify
- If checks fail, pauses and waits for you to fix
- Re-runs (up to 5 attempts, configurable)
- When green, commits and pushes automatically
Use -NoCommit (PowerShell) or false as third arg (Bash) to skip auto-push.
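The control flow of the script steps listed above can be sketched as a bounded retry loop. This is an illustration under assumptions, not a port of the shipped scripts: the function name `validate_and_push` and its callable parameters (`run`, `wait_for_fix`, `push`) are hypothetical stand-ins so the flow can be followed and tested without running real commands.

```python
from typing import Callable, Dict


def validate_and_push(
    commands: Dict[str, str],
    run: Callable[[str], bool],          # runs one command, True on success
    wait_for_fix: Callable[[], None],    # pause until the user has fixed things
    push: Callable[[], None],            # commit + push step
    max_attempts: int = 5,
    commit: bool = True,
) -> str:
    """Return 'pushed', 'green' (validated, push skipped) or 'failed'."""
    for attempt in range(1, max_attempts + 1):
        # Auto-fix pass: unset commands are skipped.
        for key in ("FORMAT_COMMAND", "LINT_FIX_COMMAND"):
            if commands.get(key):
                run(commands[key])
        # Verify pass: stops at the first failing check.
        verified = all(
            run(commands[key])
            for key in ("LINT_COMMAND", "TEST_COMMAND")
            if commands.get(key)
        )
        if verified:
            if commit:
                push()
                return "pushed"
            return "green"
        if attempt < max_attempts:
            wait_for_fix()
    return "failed"
```

A real runner could wrap `subprocess.run(cmd, shell=True).returncode == 0`; passing `commit=False` corresponds to the skip-auto-push switch mentioned above.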
Layer 3: CI auto-fix (GitHub Actions)
Disabled by default. Set CI_AUTO_FIX: true in docs/engineering/runbook.md
to enable. When enabled and CI fails after a push, the auto-fix job kicks in:
push / PR ──> checks ──> PASS ──> done
│
FAIL
│
auto-fix job
│
run LINT_FIX_COMMAND
run FORMAT_COMMAND
│
changes found?
╱ ╲
yes no
│ │
commit + push report failure
│ (manual fix needed)
CI re-runs
(up to 3x)
Auto-fix commits appear as ci: auto-fix attempt N/3. After 3 retries the
workflow stops and points you to scripts/validate-and-push for local fixing.
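The branches of the auto-fix diagram can be condensed into one decision function. This is a sketch of the described behavior, not the workflow file itself: the function name `ci_next_step`, its parameters, and the returned strings (apart from the commit message pattern quoted above) are hypothetical.

```python
MAX_RETRIES = 3  # "up to 3 retries" per the description above


def ci_next_step(checks_passed: bool, auto_fix_enabled: bool,
                 attempt: int, changes_found: bool) -> str:
    """Pick the next CI action.

    attempt is the 1-based number of the auto-fix attempt about to be made.
    """
    if checks_passed:
        return "done"
    if not auto_fix_enabled:
        return "report failure"                      # CI_AUTO_FIX is off
    if attempt > MAX_RETRIES:
        return "stop: fix locally with scripts/validate-and-push"
    if not changes_found:
        return "report failure (manual fix needed)"  # fixers produced no diff
    return f"commit: ci: auto-fix attempt {attempt}/{MAX_RETRIES}"


print(ci_next_step(False, True, 2, True))
```

Capping the attempt counter is what keeps the commit and re-run cycle from looping forever on a failure the fixers cannot resolve.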
Examples
Example 1: New feature from idea
- /intake
- /research
- /architecture
- /sprint-plan
- /plan-verify
- /execute
- /qa
- /verify-work
- /release
- /refresh-context
Example 2: Mid-flight idea change
- Set AUTO_PAUSE_REQUEST=1
- Run /intake to update story/acceptance
- Re-run /sprint-plan + /plan-verify
- Resume via /auto
Example 3: Pause/resume
- /pause
- Close work
- /resume next session
Deterministic /auto continuation
When resuming mid-process, /auto resolves start phase deterministically:
- explicit /auto start-from=<phase>
- handoffs/resume_brief.md
- conservative docs/engineering/state.md fallback
- fail-fast (no guessing)
Canonical phases:
intake, discovery, research, architecture, sprint-plan,
plan-verify, execute, qa, verify-work, release, refresh-context.
Fail-fast message format:
[AUTO_RESUME_ERROR] <code>: <summary>. Source=<source>. Fix: <action>.
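The resolution order and the fail-fast format can be sketched as a priority lookup. This assumes the phase names have already been extracted from each source; how /auto actually reads handoffs/resume_brief.md and docs/engineering/state.md is not shown here. The function name `resolve_start_phase`, the exception class, and the error codes `UNKNOWN_PHASE` and `NO_START_PHASE` are hypothetical.

```python
from typing import Optional

CANONICAL_PHASES = (
    "intake", "discovery", "research", "architecture", "sprint-plan",
    "plan-verify", "execute", "qa", "verify-work", "release", "refresh-context",
)


class AutoResumeError(Exception):
    """Raised instead of guessing a start phase."""


def _fail(code: str, summary: str, source: str, action: str) -> AutoResumeError:
    return AutoResumeError(
        f"[AUTO_RESUME_ERROR] {code}: {summary}. Source={source}. Fix: {action}."
    )


def resolve_start_phase(explicit: Optional[str] = None,
                        resume_brief_phase: Optional[str] = None,
                        state_phase: Optional[str] = None) -> str:
    """Use the first source that names a phase, in documented priority order."""
    candidates = (
        ("start-from", explicit),
        ("handoffs/resume_brief.md", resume_brief_phase),
        ("docs/engineering/state.md", state_phase),
    )
    for source, phase in candidates:
        if phase is None:
            continue
        if phase not in CANONICAL_PHASES:
            # Assumption: an invalid value fails here rather than falling through.
            raise _fail("UNKNOWN_PHASE", f"'{phase}' is not a canonical phase",
                        source, "use one of the canonical phase names")
        return phase
    raise _fail("NO_START_PHASE", "no start phase could be resolved",
                "none", "run /auto start-from=<phase>")


print(resolve_start_phase(explicit="qa", resume_brief_phase="execute"))  # qa
```

An explicit `start-from` always wins, and an unrecognized value stops the run rather than silently falling back to a lower-priority source.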
Compatibility and safety:
- Manual/interactive workflow stays unchanged unless /auto continuation is used.
- Existing stop conditions remain enforced (decision gate, missing input, pause request, loop max).
Example 4: Existing project onboarding
- /map-codebase
- Review generated mapping artifacts
- Continue with /intake or /architecture
Other useful capabilities
Voice input (multilingual)
Voice is an input layer only; it feeds normal slash commands.
- OS dictation
- Cursor voice (if available)
- Local STT tooling
Reliable pattern:
- bind /intake insertion shortcut
- dictate only the content after the command
Repository layout (quick orientation)
- .cursor/: commands, rules, agents, hooks, skills, scratchpad.
- docs/: product + engineering docs.
- sprints/: sprint planning/tracking.
- handoffs/: role-to-role transfers.
- decisions/: decision records.
- .github/workflows/: CI/CD templates.
Developer and release deep-dive
CI/CD via runbook
Workflows read keys from docs/engineering/runbook.md:
- TEST_COMMAND
- LINT_COMMAND
- TYPECHECK_COMMAND
- DEPLOY_STAGING_COMMAND
- DEPLOY_PROD_COMMAND
Unset keys are skipped. The template ships with empty values for LINT_COMMAND,
FORMAT_COMMAND, and TYPECHECK_COMMAND -- this is intentional. its-magic is a
template/installer project; fill in your project-specific commands after setup.
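The "unset keys are skipped" rule can be illustrated with a small reader for the runbook keys. This is a sketch that assumes the keys appear as `KEY: value` lines, as in the runbook snippet shown earlier in this README; the function name `read_runbook_commands` is hypothetical, and the real workflows may extract the values differently.

```python
import re
from typing import Dict

# An upper-case key, a colon, then an optional value on the same line.
KEY_LINE = re.compile(r"^([A-Z][A-Z0-9_]*):[ \t]*(.*)$")


def read_runbook_commands(text: str) -> Dict[str, str]:
    """Return only the keys that have a non-empty value."""
    commands: Dict[str, str] = {}
    for line in text.splitlines():
        match = KEY_LINE.match(line.strip())
        if match and match.group(2).strip():
            commands[match.group(1)] = match.group(2).strip()
    return commands


runbook = """\
TEST_COMMAND: npm test
LINT_COMMAND:
TYPECHECK_COMMAND:
CI_AUTO_FIX: true
"""
print(read_runbook_commands(runbook))
```

With the template's empty `LINT_COMMAND` and `TYPECHECK_COMMAND`, only `TEST_COMMAND` and `CI_AUTO_FIX` survive, so a caller iterating over the result never tries to run a blank command.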
Installer internals
- installer.ps1 (Windows)
- installer.sh (macOS/Linux)
- installer.py (fallback)
Modes: missing, overwrite, interactive, upgrade (+ optional backup).
Release automation
Unified release scripts:
- Windows: scripts/release-all.ps1
- macOS/Linux: scripts/release-all.sh
NPM helpers:
- npm run release:all
- npm run release:all:patch|minor|major|beta|dry
- npm run release:npm-only|choco-only|brew-only
Release script flow:
- bump package.json version
- publish npm
- create GitHub release
- update/publish Chocolatey package
- update/push Homebrew formula (stable or beta)
flowchart LR
ReleaseAll[scripts/release-all.*] --> VerCheck{version has -?}
VerCheck -->|stable| NPM["npm publish --tag latest"]
VerCheck -->|prerelease| NPMBeta["npm publish --tag beta"]
ReleaseAll --> GH["gh release create"]
VerCheck -->|prerelease| GHPre["--prerelease flag"]
GH --> Choco[choco pack + push]
GH --> BrewCheck{prerelease?}
BrewCheck -->|no| BrewStable[its-magic.rb]
BrewCheck -->|yes| BrewBeta[its-magic-beta.rb]
NPM --> U1["npx its-magic"]
NPMBeta --> U1b["npx its-magic@beta"]
Choco --> U2["choco install its-magic"]
Choco --> U2b["choco install its-magic --pre"]
BrewStable --> U3["brew install its-magic"]
BrewBeta --> U3b["brew install its-magic-beta"]
Prereqs:
- npm login
- gh auth login
- Chocolatey API key (if choco publish)
- Homebrew tap repo for formula distribution
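The stable versus prerelease branching in the release flowchart hinges on one check: whether the version string contains a hyphen. A minimal sketch of that rule follows; the function name `release_channel` and the returned dict keys are hypothetical, while the tag and formula names are the ones shown in the flowchart.

```python
from typing import Dict, Union


def release_channel(version: str) -> Dict[str, Union[str, bool]]:
    """Map a version string to the publishing choices from the flowchart."""
    prerelease = "-" in version   # e.g. a version with a -beta.1 suffix
    return {
        "npm_tag": "beta" if prerelease else "latest",
        "github_prerelease": prerelease,
        "brew_formula": "its-magic-beta.rb" if prerelease else "its-magic.rb",
    }


print(release_channel("1.0.0"))
print(release_channel("1.0.0-beta.1"))
```

The hyphen test is deliberately simple; a version that carries a hyphen only inside build metadata after a `+` would also be treated as a prerelease by this sketch.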
Package manager installation matrix
| Manager | Stable | Beta / Pre-release |
|------------|-------------------------------------------|---------------------------------------------|
| npm/npx | npx its-magic --target . --mode missing | npx its-magic@beta --target . --mode missing |
| Chocolatey | choco install its-magic | choco install its-magic --pre |
| Homebrew | brew install USER/tap/its-magic | brew install USER/tap/its-magic-beta |
Release package contents
Published npm package includes runtime content only (commands/rules/agents/docs/installers).
Excluded from npm package:
- benchmarks/
- tests/
- packaging/
- Plan.md
Benchmarks
- Main benchmark: benchmarks/run-bench.ps1 or benchmarks/run-bench.sh
- Live benchmark: benchmarks/live/run-live-bench.*
- Prompted benchmark: benchmarks/prompts/run-prompts.*
- Headless benchmark: benchmarks/headless/run-headless.*
Reports:
- benchmarks/bench-report.md
- benchmarks/live/live-bench-report.md
- benchmarks/headless/headless-report.md
- benchmarks/headless/protocol.md
flowchart TD
StartBench[Start benchmark] --> SelectScenario[Load scenarios]
SelectScenario --> InstallKit[Install its-magic into temp workspace]
InstallKit --> RunChecks[Validate required files/sections]
RunChecks --> BenchReport[Write benchmarks/bench-report.md]
flowchart TD
StartLive[Start live benchmark] --> SetSession[Set MAGIC_BENCH_SESSION]
SetSession --> RunCommands[Run /* commands in Cursor]
RunCommands --> LogHooks[Hook telemetry to bench-log.jsonl]
LogHooks --> LiveReport[Write live-bench-report.md]
flowchart TD
StartHeadless[Start headless run] --> LoadPrompt[Load prompt blocks]
LoadPrompt --> TempWorkspace[Create temp workspace]
TempWorkspace --> InstallHeadless[Install its-magic]
InstallHeadless --> RunAgent[agent -p --force for each step]
RunAgent --> ValidateOutputs[Validate files/sections/smoke checks]
ValidateOutputs --> WriteProtocol[Write protocol.md]
WriteProtocol --> WriteHeadlessReport[Write headless-report.md]
Rules
- core.mdc: phase flow, context pack, pause/resume, remote usage.
- quality.mdc: small steps, tests/quality, optional auto-install.
- coding-standards.mdc: strict language best practices and code quality rules.
- handoffs.mdc: handoffs + state updates required.
- escalation.mdc: decision gate and stop conditions.
Hooks
- beforeShellExecution: blocks dangerous commands.
- beforeReadFile: warns on secret-like files.
- afterFileEdit: tracks code edits vs context refresh.
- stop: reminds context refresh when needed.
Artifacts (single source of truth)
- docs/product/*: vision, backlog, acceptance.
- docs/engineering/*: architecture, decisions, state, runbook.
- sprints/Sxxxx/*: sprint scope, tasks, progress, QA findings, summary.
- decisions/*: decision records.
- handoffs/*: role-to-role transfer notes.
