@bilalimamoglu/sift
v0.5.1
Local-first output guidance for coding agents working through noisy command output.
sift
Turn noisy command output into a short, actionable first pass for your coding agent
Local heuristics first. Group repeated failures into likely root causes and next steps before your agent reads the full log.
Get Started
```
npm install -g @bilalimamoglu/sift
```

Best today on noisy pytest, vitest, jest, tsc, ESLint, common build failures, npm audit, and terraform plan output.
Why Sift?
When an agent hits noisy output, it can eventually make sense of the log wall, but it wastes time and tokens getting there.
sift narrows that output locally first. It groups repeated failures, surfaces likely root causes, and points to the next useful step so your agent starts from signal instead of raw noise.
It is not a generic repo summarizer, not a shell telemetry product, and not a benchmark dashboard. It is a local-first triage layer for noisy command output in coding-agent workflows.
Turn 13,000 lines of test output into 2 root causes.
With sift, the same run becomes:
- Tests did not pass.
- 3 tests failed. 125 errors occurred.
- Shared blocker: 125 errors share the same root cause - a missing test environment variable.
Anchor: tests/conftest.py
Fix: Set the required env var before rerunning DB-isolated tests.
- Contract drift: 3 snapshot tests are out of sync with the current API or model state.
Anchor: tests/contracts/test_feature_manifest_freeze.py
Fix: Regenerate the snapshots if the changes are intentional.
- Decision: stop and act.

In one large test-status benchmark fixture, sift compressed 198,026 raw output tokens to 129. That is scoped proof for a noisy test-debugging case, not a promise that every preset behaves the same way.
Quick Start
1. Install
```
npm install -g @bilalimamoglu/sift
```

Requires Node.js 20+.
2. Try the main workflow
If you are new, start here and ignore hook beta and native surfaces for now:
```
sift exec --preset test-status -- pytest -q
```

Other common entry points:

```
sift exec --preset test-status -- npx vitest run
sift exec --preset test-status -- npx jest
sift exec "what changed?" -- git diff
```

3. Zoom only if needed
Think of the workflow like this:
- standard = map
- focused = zoom
- raw traceback = last resort
```
sift rerun
sift rerun --remaining --detail focused
```

If standard already gives you the likely root cause, anchor, and fix, stop there and act.
Benchmark Results
The output reduction above measures a single command's raw output. The table below measures one replayed end-to-end debug loop: how many tokens, tool calls, and seconds the agent spent to reach the same outcome in that benchmarked scenario.
Real debug loop on a 640-test Python backend with 124 repeated setup errors, 3 contract failures, and 511 passing tests:
| Metric | Without sift | With sift | Reduction |
|--------|-------------:|----------:|----------:|
| Tokens | 52,944 | 20,049 | 62% fewer |
| Tool calls | 40.8 | 12 | 71% fewer |
| Wall-clock time | 244s | 85s | 65% faster |
| Commands | 15.5 | 6 | 61% fewer |
| Outcome | Same | Same | Same outcome |
Same outcome, less agent thrash.
Methodology and caveats: BENCHMARK_NOTES.md
How It Works
sift keeps the explanation simple:
- Capture output. Run the noisy command, or accept output already piped in.
- Run local heuristics. Detect known failure shapes first so common cases stay cheap and deterministic.
- Return a useful first pass. When heuristics are confident, sift gives the agent grouped failures, likely root causes, and the next step.
- Fall back only when needed. If heuristics are not enough, sift uses a cheaper model instead of spending your main agent budget.
Your agent spends tokens fixing, not reading.
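The grouping step above can be sketched roughly like this. This is an illustrative sketch only — the function, the normalization rules, and the sample log are hypothetical and not sift's actual implementation:

```python
import re
from collections import Counter

def group_failures(lines):
    """Collapse repeated failure lines into root-cause buckets by
    normalizing away variable details (numbers, quoted values)."""
    buckets = Counter()
    for line in lines:
        if "Error" not in line and "FAILED" not in line:
            continue  # keep only failure-shaped lines
        # Normalize digits and quoted strings so 125 instances of the
        # same underlying error collapse into a single bucket.
        key = re.sub(r"\d+", "N", line)
        key = re.sub(r"'[^']*'", "'...'", key)
        buckets[key] += 1
    # Most common bucket first: the likely shared root cause.
    return buckets.most_common()

# Toy log shaped like the example above: 125 repeated env-var errors
# plus one contract failure.
log = (
    ["FAILED tests/test_db.py::test_row_%d - KeyError: 'DB_URL'" % i for i in range(125)]
    + ["FAILED tests/contracts/test_manifest.py - snapshot mismatch"]
)
for key, count in group_failures(log):
    print(count, key)
```

The point of the normalization is that the agent reads two buckets instead of 126 lines; the real heuristics are preset-specific and more careful than a digit-stripping regex.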
Key Features
Test Failure Guidance
Collapse repeated pytest, vitest, and jest failures into grouped issues with likely root causes, anchors, and fix hints.
Typecheck and Lint Guidance
Group noisy tsc and ESLint output into the few issues that actually matter instead of dumping the whole log back into the model.
Build Failure Extraction
Pull out the first concrete error from webpack, esbuild/Vite, Cargo, Go, GCC/Clang, and similar build output.
Audit and Infra Risk
Surface high-impact npm audit findings and destructive terraform plan signals without making the agent read everything.
Heuristic-First by Default
Every built-in preset tries local parsing first. When the heuristic handles the output, no provider call is needed.
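The heuristic-first gate can be pictured as follows. All names and the threshold are hypothetical, not sift's real internals:

```python
def triage(output, parse_heuristically, threshold=0.8):
    """Try the local heuristic parser first; escalate to a fallback
    model only when the heuristic is not confident."""
    summary, confidence = parse_heuristically(output)
    if confidence >= threshold:
        return ("local", summary)    # no provider call needed
    return ("fallback", summary)     # hand off to a cheaper model

def toy_parser(output):
    # Confident only when it recognizes a known failure shape.
    if "KeyError" in output:
        return ("missing key / env var", 0.9)
    return ("unrecognized failure", 0.2)

print(triage("KeyError: 'DB_URL'", toy_parser))  # local path
print(triage("???", toy_parser))                 # fallback path
```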
Agent and Automation Friendly
Use sift in Codex, Claude, CI, hooks, or shell scripts when you want downstream tooling to receive a short first pass instead of the raw log wall.
Presets
| Preset | What it does | Needs provider? |
|--------|--------------|:---------------:|
| test-status | Groups pytest, vitest, and jest failures into root-cause buckets with anchors and fix suggestions. | No |
| typecheck-summary | Parses tsc output and groups issues by error code. | No |
| lint-failures | Parses ESLint output and groups failures by rule. | No |
| build-failure | Extracts the first concrete build error from common toolchains. | Fallback only |
| contract-drift | Detects explicit snapshot, golden, OpenAPI, manifest, or generated-artifact drift without broadening into generic repo analysis. | Fallback only |
| audit-critical | Pulls high and critical npm audit findings. | No |
| infra-risk | Detects destructive signals in terraform plan. | No |
| diff-summary | Summarizes change sets and likely risks in diff output. | Yes |
| log-errors | Extracts the strongest error signals from noisy logs. | Fallback only |
When output already exists in a pipeline, use pipe mode instead of exec:
```
pytest -q 2>&1 | sift preset test-status
npm audit 2>&1 | sift preset audit-critical
```

Setup and Agent Integration
If you want deeper integration after the first successful sift exec run, start with:
```
sift install
```

Most built-in presets run entirely on local heuristics with no API key required. If you want deeper fallback for ambiguous cases, sift also supports OpenAI-compatible and OpenRouter-compatible endpoints.
During install, pick the mode that matches reality:
- agent-escalation: sift gives the first answer, then your agent keeps going
- provider-assisted: sift itself can ask a cheap fallback model when needed
- local-only: keep everything local
Runtime-native files are small guidance surfaces, not a second execution system:
- Codex: managed AGENTS.md block plus a generated SKILL.md
- Claude: managed CLAUDE.md block plus a generated .claude/commands/sift/ command pack
- Cursor: optional .cursor/skills/sift/SKILL.md path when you want an explicit native Cursor skill
Default rule:
- use sift exec for the normal first pass
- use sift hook only as an optional beta shortcut for a tiny known-command set
Optional local evidence surfaces:
```
sift gain
sift discover
```

- gain shows local, metadata-only first-pass history
- discover stays quiet unless your own local history is strong enough to justify a concrete suggestion
If you want the full install, ownership, and touched-files details, see docs/cli-reference.md. The short version: sift does not write shell rc files, PATH entries, git hooks, or arbitrary repo files during install.
If you want this repo's tracked pre-push verification hook to actually run on your machine, you still need to activate it once:
```
npm run setup:hooks
```

Test Debugging Workflow
For noisy test failures, start with the test-status preset and let standard be the default stop point.
```
sift exec --preset test-status -- <test command>
sift rerun
sift rerun --remaining --detail focused
sift rerun --remaining --detail verbose --show-raw
```

Useful rules of thumb:
- If standard ends with Decision: stop and act, go read source and fix the issue.
- Use sift rerun after a change to refresh the same test command at standard.
- Use sift rerun --remaining to zoom into what still fails after the first pass.
- Treat raw traceback as the last resort, not the starting point.
For machine branching or automation, test-status also supports diagnose JSON:
```
sift exec --preset test-status --goal diagnose --format json -- pytest -q
sift rerun --goal diagnose --format json
```

Diagnose JSON is summary-first on purpose. If read_targets.anchor_kind = traceback and read_targets.context_hint.kind = exact_window, read that narrow range first. If the read target is lower-confidence or search_only, treat it as a representative hint rather than exact root-cause proof.
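A machine-branching consumer of diagnose JSON might look like this. Only the read_targets.anchor_kind and read_targets.context_hint.kind fields come from the description above; the surrounding payload shape and the strategy names are assumptions for illustration:

```python
import json

def pick_read_strategy(payload):
    """Branch on the confidence signals in a diagnose JSON payload.
    Field names beyond anchor_kind / context_hint.kind are assumed."""
    targets = payload.get("read_targets", {})
    kind = targets.get("anchor_kind")
    hint = targets.get("context_hint", {}).get("kind")
    if kind == "traceback" and hint == "exact_window":
        return "read_exact_window"  # high confidence: read the narrow range first
    if hint == "search_only":
        return "treat_as_hint"      # representative hint, not exact proof
    return "fall_back_to_summary"

sample = json.loads(
    '{"read_targets": {"anchor_kind": "traceback",'
    ' "context_hint": {"kind": "exact_window"}}}'
)
print(pick_read_strategy(sample))  # -> read_exact_window
```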
Limitations
- sift adds the most value when output is long, repetitive, and shaped by a small number of root causes. For short, obvious failures it may not save much.
- The deepest local heuristic coverage is in test debugging (pytest, vitest, jest). Other presets have solid heuristics but less depth.
- sift does not help with interactive or TUI-based commands.
- sift is not a generic repo summarizer or broad mismatch detector. It works best when the command output itself carries strong failure or drift evidence.
- When heuristics cannot explain the output confidently, sift either falls back to a provider or returns the strongest local first pass it can, depending on how you choose to use it.
Docs
- CLI reference: docs/cli-reference.md
- Worked examples: docs/examples
- Benchmark methodology: BENCHMARK_NOTES.md
- Contributing and development notes: CONTRIBUTING.md
- Release notes: release-notes
License
MIT
