pi-amaze-tools
v0.2.5
pi / senpi extension: structural code search (ast-grep), LSP intelligence (gopls/tsserver/marksman), bounded file/web primitives, and a local-LLM result summarizer (0GM).
pi-amaze-tools — AMAZE! Zero Gravity!
Languages: English (this file) · 한국어
Inspired by Andy Weir's Project Hail Mary. Dr. Ryland Grace is the explorer who crosses stars; Rocky is the alien engineer who reads structure. AMAZE splits along the same line: one path for the inside of code (ast/lsp), one path for the outside (web search/fetch).
pi-amaze-tools is an extension package for pi-compatible coding agents:
- senpi: primary target, fully verified.
- pi-mono: senpi's upstream; same extension API, intended to work with no source changes once the host's package resolution picks up this repo.
It exposes 18 deterministic tools (filesystem / grep / AST / LSP / web) and post-processes every result through a local 35B LLM (the "0GM summarizer") that preserves all evidence while removing JSON noise — cutting host-LLM context spend without dropping signal.
Why yet another local-LLM tool?
If you own a beefy local machine and have tried any ≤35B model for coding, you've already reached two conclusions:
- Local LLMs cannot beat GPT-5.5 / Opus 4.6 in quality.
- They are slow and hallucinate. Reasoning mode is slow; non-reasoning is dumb.
I reached the same conclusions, then took a different route. Putting a local LLM in the host-agent or sub-agent seat does not replace a paid API. So I moved the local LLM to a different role:
AMAZE uses the local LLM as a "tool-result compressor", NOT a brain.
- Code/web retrieval is done by deterministic tools (ripgrep, ast-grep, LSP, playwright).
- Every raw tool result is summarized by the local 0GM into i/N:-indexed evidence lines.
- Evidence (path, line, col, identifiers, code snippets) is kept verbatim; only duplicates and JSON envelope noise are stripped.
- The actual decisions stay with the host LLM (Claude, GPT-5, Opus, …) reading the compressed result.
Net effect: the local LLM's weaknesses (judgment, hallucination) are sidestepped by giving it only a constrained extraction task, and the host LLM's context spend drops because tool results land at ~50% of raw size.
Architecture
┌──────────────────────────────────────────────────────────────────────┐
│ Host LLM (Claude / GPT-5 / Opus — owns judgment & edits) │
└──────────────────────┬───────────────────────────────────────────────┘
│ tool call (host agent CLI)
▼
┌──────────────────────────────────────────────────────────────────────┐
│ pi-compatible host (senpi / pi-mono) │
│ TypeScript agent framework, jiti-loads ext at session start │
└──────────────────────┬───────────────────────────────────────────────┘
│ 18 tools registered
▼
┌──────────────────────────────────────────────────────────────────────┐
│ pi-amaze-tools (src/index.ts, ~2.1k LOC) │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ Core (8) ls find read_many grep_many changed health │ │
│ │ web_search web_fetch │ │
│ ├────────────────────────────────────────────────────────────────┤ │
│ │ AST (4) ast_search ast_definitions ast_callers │ │
│ │ ast_imports (ast-grep CLI) │ │
│ ├────────────────────────────────────────────────────────────────┤ │
│ │ LSP (6) lsp_definition lsp_references lsp_hover │ │
│ │ lsp_implementation lsp_call_hierarchy │ │
│ │ lsp_status (gopls / tsserver / marksman) │ │
│ └────────────────────┬───────────────────────────────────────────┘ │
│ │ raw tool result │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ Secret redactor (pem, ghp_, sk-ant-, sk-, AKIA, AIza, xox…) │ │
│ └────────────────────┬───────────────────────────────────────────┘ │
│ │ redacted text │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ 0GM summarizer (http://127.0.0.1:8010/v1) │ │
│ │ · N-in-N-out enumeration (`1/22:`, …, `22/22:`) │ │
│ │ · history-free continuation on truncation │ │
│ │ · 2-pass dedupe (by index, then by content) │ │
│ │ · style-template continuation prompt │ │
│ │ · bypass for line-oriented tools (grep_many) │ │
│ └────────────────────┬───────────────────────────────────────────┘ │
└────────────────────────┼─────────────────────────────────────────────┘
│ compact result (~50% of raw, signal kept)
▼
Host LLM (reads, decides, edits)
Tool surface (18 tools)
| Group | Tools |
|---|---|
| Core (8) | amaze_health amaze_ls amaze_find amaze_read_many amaze_grep_many amaze_changed amaze_web_search amaze_web_fetch |
| AST (4) | amaze_ast_search amaze_ast_definitions amaze_ast_callers amaze_ast_imports |
| LSP (6) | amaze_lsp_definition amaze_lsp_references amaze_lsp_hover amaze_lsp_implementation amaze_lsp_call_hierarchy amaze_lsp_status |
Every tool returns:
{
content: [{ type: "text", text: "<summary or raw>" }],
details: {
summarized: boolean, // did 0GM compress this?
raw_chars: number,
summary_chars: number,
ratio: number, // summary_chars / raw_chars
redactions: [...], // masked secret-pattern hits
original: <structured data>,
}
}
The amaze_* prefix is non-negotiable: it lets the host LLM tell our redacted + summarized tools apart from the agent's built-in ls, read, grep, etc.
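The response envelope described above can be written down as a TypeScript type, which makes it easier to inspect results programmatically. This is a minimal sketch, not the package's actual type definitions: the element type of `redactions` and the type of `original` are not spelled out in this README, so they are left as `unknown`, and `describeCompression` is a hypothetical helper introduced only for illustration.

```typescript
// Sketch of the envelope every amaze_* tool returns, per the README.
// Fields marked "assumed" are typed loosely because the README does not
// specify them.
interface AmazeToolResult {
  content: Array<{ type: "text"; text: string }>;
  details: {
    summarized: boolean; // did 0GM compress this?
    raw_chars: number;
    summary_chars: number;
    ratio: number; // summary_chars / raw_chars
    redactions: unknown[]; // assumed: one entry per masked secret-pattern hit
    original: unknown; // assumed: tool-specific structured data
  };
}

// Hypothetical helper: one-line description of what the summarizer did.
function describeCompression(result: AmazeToolResult): string {
  const d = result.details;
  if (!d.summarized) {
    return `raw passthrough (${d.raw_chars} chars)`;
  }
  const pct = Math.round(d.ratio * 100);
  return `${d.raw_chars} -> ${d.summary_chars} chars (${pct}% of raw), ${d.redactions.length} redaction(s)`;
}
```

Checking `details.summarized` first matters because passthrough and fallback results carry raw text, so the ratio fields are only meaningful when the summarizer actually ran.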
Installation
Pick one of three install shapes. All three give the host the same 18 tools.
Option A — npm (recommended for end users)
Add the package to your host's settings:
// senpi: ~/.senpi/agent/settings.json
// pi-mono: ~/.pi/agent/settings.json (path may differ)
{
"packages": [
"npm:pi-amaze-tools"
]
}
The host fetches the package from npm on the next session start. Cache lives under ~/.senpi/agent/git/ or ~/.senpi/agent/npm/ depending on source kind.
Option B — git (track a branch directly)
{
"packages": [
"git:github.com/steve-8000/pi-amaze-tools"
]
}
Useful when you want to follow main or a specific branch ahead of the next npm release: git:github.com/steve-8000/pi-amaze-tools@some-branch.
Option C — local clone (for development on the extension itself)
git clone https://github.com/steve-8000/pi-amaze-tools.git ~/src/pi-amaze-tools
cd ~/src/pi-amaze-tools
npm install
npm run check # tsc --noEmit; no build artifact, jiti loads TS directly
# Wire it into senpi as a single-file symlink:
ln -s ~/src/pi-amaze-tools/src/index.ts ~/.senpi/agent/extensions/amaze.ts
Or add the local checkout to the host's packages list with a path:
{ "packages": ["/Users/me/src/pi-amaze-tools"] }
pi-mono note
pi-mono shares senpi's extension API. Options A–C are intended to work
unchanged. If pi-mono publishes its types under a different npm name and the
import fails, edit one line in src/index.ts:
// senpi:
import { defineTool, type ExtensionAPI } from "@code-yeongyu/senpi";
// pi-mono (hypothetical):
import { defineTool, type ExtensionAPI } from "@badlogic/pi-mono";
Please open an issue with the working pi-mono config and we'll fold it back into this README.
Configure the 0GM summarizer
pi-amaze-tools post-processes tool results through a local OpenAI-compatible LLM (the "0GM summarizer"). Two pieces:
1. Run the LLM server
brew install llama.cpp # macOS (Linux: build from source)
LLM_MODEL_PATH=${HOME}/Models/0GM-1.0-35B-A3B-0427.i1-Q4_K_S.gguf
llama-server --host 127.0.0.1 --port 8010 \
-m "${LLM_MODEL_PATH}" --alias 0gm \
-c 32768 -ngl 999 &
Keep it alive across reboots with a launchd plist (macOS) or systemd unit (Linux). If 0GM is unreachable, pi-amaze-tools silently falls back to raw tool output (details.summarized becomes false).
2. Point the extension at it
The defaults match the command above. Override via environment variables if your endpoint, model, or budget differs:
# In your shell rc (~/.zshrc, ~/.bashrc) or before launching the host
export AMAZE_LLM_BASE_URL="http://127.0.0.1:8010/v1"
export AMAZE_LLM_MODEL="0GM-1.0-35B-A3B-0427.i1-Q4_K_S"
export AMAZE_LLM_COMPRESSION_RATIO="0.5"
export AMAZE_LLM_DISABLE="" # set to "1" to bypass the summarizer entirely
Or change at runtime inside a host session:
/amaze-llm status # current config
/amaze-llm ratio 0.4 # target ratio (soft)
/amaze-llm off | /amaze-llm on
/amaze-llm model <id> # switch model
/amaze-llm base <url> # switch endpoint
/amaze-llm reset # drop runtime overrides, back to env vars
Full env-variable reference: docs/environment.md.
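For readers who want to see how the four environment variables might map onto a config object, here is a minimal sketch. It is not the extension's actual loader. The default values are copied from the example exports above; the validation rule (a ratio outside (0, 1] falls back to the default) and the names `SummarizerConfig` and `loadSummarizerConfig` are assumptions made for illustration.

```typescript
// Hypothetical config shape; names are illustrative, not the package's API.
interface SummarizerConfig {
  baseUrl: string;
  model: string;
  ratio: number;
  disabled: boolean;
}

// Defaults taken from the example exports in this README.
const DEFAULTS: SummarizerConfig = {
  baseUrl: "http://127.0.0.1:8010/v1",
  model: "0GM-1.0-35B-A3B-0427.i1-Q4_K_S",
  ratio: 0.5,
  disabled: false,
};

// Reads the AMAZE_LLM_* variables from an env-like map (for example
// process.env) and falls back to the defaults for unset or invalid values.
function loadSummarizerConfig(env: Record<string, string | undefined>): SummarizerConfig {
  const ratio = Number.parseFloat(env["AMAZE_LLM_COMPRESSION_RATIO"] ?? "");
  return {
    baseUrl: env["AMAZE_LLM_BASE_URL"] || DEFAULTS.baseUrl,
    model: env["AMAZE_LLM_MODEL"] || DEFAULTS.model,
    // Assumption: only ratios in (0, 1] are accepted; NaN fails both checks.
    ratio: ratio > 0 && ratio <= 1 ? ratio : DEFAULTS.ratio,
    // Per the README, "1" bypasses the summarizer entirely.
    disabled: env["AMAZE_LLM_DISABLE"] === "1",
  };
}
```

Taking the environment as a parameter rather than reading `process.env` directly keeps the function easy to test and leaves room for runtime overrides such as the /amaze-llm commands.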
3. Optional system dependencies
brew install ast-grep # ast_* tools
brew install gopls marksman # lsp_* (Go, Markdown)
npm install -g typescript typescript-language-server # lsp_* (TS/JS)
These are runtime requirements for the matching tools; the extension imports none of them at load time, so missing servers just disable the corresponding tools rather than failing.
Disable conflicting senpi packages
Some senpi packages register tools that overlap with pi-amaze-tools. To make
the host LLM consistently pick the redacted+summarized versions, disable the
extension half of these packages in ~/.senpi/agent/settings.json while
keeping their skills/prompts:
{
"packages": [
{ "source": "git:github.com/code-yeongyu/pi-ast-grep", "extensions": [] },
{ "source": "git:github.com/code-yeongyu/pi-lsp-client", "extensions": [] },
{ "source": "git:github.com/code-yeongyu/pi-webfetch", "extensions": [] },
{ "source": "git:github.com/code-yeongyu/pi-websearch", "extensions": [] }
]
}
Host core tools (bash, ls, read, write, edit, grep, find) stay active. pi-amaze-tools' tools are namespaced amaze_* so they don't collide, and host-LLM policy (see AGENTS.md) routes structured queries to amaze_* first.
0G Labs 0GM-1.0-35B-A3B tuning
The summarizer prompt is tuned for 0G Labs
0GM-1.0-35B-A3B-0427 (a Qwen3-35B-A3B fine-tune).
| Area | 0GM-specific tuning |
|---|---|
| N-in-N-out enumeration | Input has N entries → output has N evidence lines, each prefixed 1/N:, 2/N:, …, N/N:. When the model stops short of N, a history-free continuation prompt re-asks for the missing range; output is deduped per index and per content. |
| Style template continuation | The first emitted line is fed back into the continuation prompt as a style template so format stays consistent across the full sequence. |
| Snippet preservation | System prompt explicitly forbids paraphrasing code snippets; long snippets keep first ~20 lines verbatim and end with …. |
| Bypass list | Tools whose output is already canonical line-oriented evidence (amaze_grep_many) skip the LLM round-trip entirely. |
Other OpenAI-compatible endpoints (ollama, vLLM, lm-studio, …) work, but the prompt is calibrated for 0GM's output pattern. Best results come from 0GM + pi-amaze-tools.
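The N-in-N-out bookkeeping in the table above (parse `i/N:` lines, dedupe by index and then by content, find the range a continuation prompt would need to re-ask for) can be sketched as follows. This is an illustration under stated assumptions, not the code in src/index.ts: the regular expression, the "first occurrence wins" rule, and all function names are hypothetical.

```typescript
interface EvidenceLine {
  index: number; // the i in "i/N:"
  total: number; // the N in "i/N:"
  text: string; // everything after the prefix
}

// Parse lines shaped like "3/22: some evidence"; other lines are ignored.
function parseEvidence(output: string): EvidenceLine[] {
  const lines: EvidenceLine[] = [];
  for (const raw of output.split("\n")) {
    const m = /^(\d+)\/(\d+):\s*(.*)$/.exec(raw.trim());
    if (m) {
      lines.push({ index: Number(m[1]), total: Number(m[2]), text: m[3] ?? "" });
    }
  }
  return lines;
}

// Keep the first line seen for each key (assumption: first occurrence wins).
function keepFirst<K>(lines: EvidenceLine[], key: (l: EvidenceLine) => K): EvidenceLine[] {
  const seen = new Set<K>();
  return lines.filter((l) => {
    const k = key(l);
    if (seen.has(k)) return false;
    seen.add(k);
    return true;
  });
}

// Two-pass dedupe: by index first, then by content.
function dedupe(lines: EvidenceLine[]): EvidenceLine[] {
  return keepFirst(keepFirst(lines, (l) => l.index), (l) => l.text);
}

// Indices in 1..total that have not been emitted yet; a continuation
// prompt would ask the model for exactly these.
function missingIndices(lines: EvidenceLine[], total: number): number[] {
  const have = new Set(lines.map((l) => l.index));
  const missing: number[] = [];
  for (let i = 1; i <= total; i++) {
    if (!have.has(i)) missing.push(i);
  }
  return missing;
}
```

Because the missing range is computed from the parsed output alone, a continuation request needs no chat history, which is one way to read the "history-free" wording in the table.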
Benchmarks
Two questions matter: (1) does the summarizer actually compress? and (2) does the host LLM make better decisions with AMAZE-routed results than with raw tool output? We measured both.
(1) Compression (per-tool, ratio target 0.5)
Each of the 18 tools was called once against this workspace itself. The table compares the raw payload (what the deterministic tool produced) with the host-facing output (what the 0GM summarizer emitted after redaction).
| Tool | raw_chars | output_chars | ratio | notes |
|---|---:|---:|---:|---|
| amaze_health | small | passthrough | n/a | under MIN_CHARS |
| amaze_ls | 2,220 | 859 | 0.37 | 34/34 entries, uniform format |
| amaze_find | 3,558 | 1,747 | 0.49 | 22/22 entries, uniform format |
| amaze_read_many | 7,710 | 5,305 | 0.69 | code body kept verbatim |
| amaze_grep_many | 9,105 | 9,105 | 1.00 | bypassed (line-oriented already) |
| amaze_changed | 7,124 | 1,232 | 0.17 | 34/34 files w/ risk tags |
| amaze_ast_search | 5,031 | 2,928 | 0.58 | 19/19 matches, snippets kept |
| amaze_ast_definitions | small | passthrough | n/a | structured & small |
| amaze_ast_callers | 4,190 | 2,229 | 0.53 | 17/17 callers w/ snippets |
| amaze_ast_imports | 4,720 | 1,955 | 0.41 | uniform format |
| amaze_lsp_* (6 tools) | small / failed | passthrough | n/a | structured & small |
| amaze_web_search | 5,400 | 4,984 | 0.92 | small inflation cost; signal kept |
| amaze_web_fetch | small | passthrough | n/a | under MIN_CHARS |
Aggregate across the summarized tools: ~50% of raw, all evidence
preserved (every i/N: row visible to the host).
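The aggregate can be re-derived from the table. The sketch below sums the eight rows that actually went through the summarizer (the bypassed amaze_grep_many row and the passthrough rows are excluded) and divides total output by total raw. The numbers are copied from the table; the variable and function names are illustrative only.

```typescript
// [tool, raw_chars, output_chars] for the rows that were summarized.
const summarizedRows: Array<[string, number, number]> = [
  ["amaze_ls", 2220, 859],
  ["amaze_find", 3558, 1747],
  ["amaze_read_many", 7710, 5305],
  ["amaze_changed", 7124, 1232],
  ["amaze_ast_search", 5031, 2928],
  ["amaze_ast_callers", 4190, 2229],
  ["amaze_ast_imports", 4720, 1955],
  ["amaze_web_search", 5400, 4984],
];

// Character-weighted aggregate ratio: total output chars / total raw chars.
function aggregateRatio(rows: Array<[string, number, number]>): number {
  let raw = 0;
  let out = 0;
  for (const [, rawChars, outChars] of rows) {
    raw += rawChars;
    out += outChars;
  }
  return out / raw;
}

const aggregate = aggregateRatio(summarizedRows); // 21239 / 39953, about 0.53
```

Weighting by characters rather than averaging the per-tool ratios reflects what the host LLM actually pays in context; large payloads such as amaze_read_many dominate the result.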
(2) Signal preservation — a real case
We ran the same exploration prompt ("inspect the source code") in two sessions:
| | Session A | Session B |
|---|---|---|
| Host LLM | Claude Opus 4.7 | GPT-5.5 |
| Tooling | raw bash, grep, find, tsc | senpi + pi-amaze-tools (18-tool surface) |
| Verdict on the repo | "Source code is in a normal state. ✅" | "NUL char detected at offsets 6131, 6206; cleanup candidate." |
Both sessions encountered the same warning from ripgrep — Binary file
src/index.ts matches. Their reactions diverged:
Session A (Opus 4.7 + raw bash) — silently worked around the warning by
re-running grep with the -a flag, never surfacing it. Concluded "normal."
The actual two raw \x00 bytes in globToRegex's ** sentinel went
unreported, and would have been committed as-is.
Session B (senpi + pi-amaze-tools) — amaze_grep_many returned the
ripgrep warning verbatim ("binary file matches (found \"\0\" byte around
offset 6131)") inside the structured result. The host LLM saw the unusual
marker in compressed output and escalated it to the user as a cleanup
candidate.
Root cause confirmed independently: tr -d -c '\0' < src/index.ts | wc -c
counted exactly 2 NUL bytes, consistent with the two reported offsets. The fix
was a one-line swap from raw "\x00" to a Private Use Area (PUA) sentinel
character in the glob-to-regex helper. After the fix, file src/index.ts
reported UTF-8 text instead of data (binary).
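To see the byte offsets directly rather than only a count, a small scan over the file's bytes is enough. This is a minimal sketch; `findNulOffsets` is a hypothetical helper and is not part of pi-amaze-tools.

```typescript
// Returns the zero-based byte offsets of every NUL (0x00) byte.
function findNulOffsets(bytes: Uint8Array): number[] {
  const offsets: number[] = [];
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] === 0) offsets.push(i);
  }
  return offsets;
}

// Usage in Node (a Buffer is a Uint8Array):
//   import { readFileSync } from "node:fs";
//   console.log(findNulOffsets(readFileSync("src/index.ts")));
```

An empty result means the file contains no NUL bytes, so ripgrep should no longer treat it as binary on that account.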
Takeaway
A stronger host LLM is not enough. When a tool flattens its output into
unstructured text, the host has to spot the signal in a sea of bytes — and
often won't. pi-amaze-tools' gain isn't "smaller text" — it's labeled,
indexed, snippet-preserving text that makes signals like
binary file matches, redactions: [...], optional_missing: [...],
truncated: true impossible to silently lose.
(3) Reproduce
Inside a host session, call any tool with a target workspace and inspect
the response envelope. details.summarized, details.raw_chars,
details.summary_chars, and details.ratio show what the summarizer did to
that specific call. Sweep AMAZE_LLM_COMPRESSION_RATIO (or use
/amaze-llm ratio …) across runs to compare.
Security
- Never commit secrets (API keys, OAuth client secrets, SSH private keys, cloud creds).
- pi-amaze-tools redacts common secret-like patterns from every tool result before it reaches the 0GM summarizer and before it lands in the host envelope: PEM blocks, GitHub PATs (ghp_, gho_, …), Anthropic keys (sk-ant-…), OpenAI keys (sk-…), AWS access key IDs (AKIA…), Google API keys (AIza…), Slack tokens (xox[baprs]-…), bearer tokens, generic client_secret = "…".
- amaze_web_fetch refuses URLs whose host resolves to a private, loopback, link-local, or unspecified address (SSRF guard) unless allow_private: true is set explicitly.
- Redaction is a guardrail, not permission to paste secrets casually.
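Pattern-based redaction of this kind can be sketched with a small table of regular expressions. The patterns below are illustrative approximations of the prefixes listed above, not the rules the extension actually ships (docs/security.md is the authority); the labels, the `[REDACTED:…]` placeholder format, and the function name are assumptions, and bearer tokens and generic client_secret assignments are left out for brevity.

```typescript
interface RedactionResult {
  text: string; // input with matches replaced by placeholders
  redactions: string[]; // one label per masked hit, in match order per pattern
}

// Assumed patterns; order matters (sk-ant- must be tried before sk-).
const SECRET_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "pem", pattern: /-----BEGIN [A-Z ]+-----[\s\S]*?-----END [A-Z ]+-----/g },
  { label: "github_pat", pattern: /\bgh[pousr]_[A-Za-z0-9]{20,}/g },
  { label: "anthropic_key", pattern: /\bsk-ant-[A-Za-z0-9_-]{10,}/g },
  { label: "openai_key", pattern: /\bsk-[A-Za-z0-9_-]{20,}/g },
  { label: "aws_access_key_id", pattern: /\bAKIA[0-9A-Z]{16}\b/g },
  { label: "google_api_key", pattern: /\bAIza[0-9A-Za-z_-]{35}/g },
  { label: "slack_token", pattern: /\bxox[baprs]-[A-Za-z0-9-]{10,}/g },
];

// Replace every match with a labeled placeholder and record the hit.
function redactSecrets(input: string): RedactionResult {
  const redactions: string[] = [];
  let text = input;
  for (const { label, pattern } of SECRET_PATTERNS) {
    text = text.replace(pattern, () => {
      redactions.push(label);
      return `[REDACTED:${label}]`;
    });
  }
  return { text, redactions };
}
```

Running redaction before the summarizer call, as the bullet above describes, means the secret never leaves the process, not even toward the local LLM endpoint.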
Full policy: docs/security.md.
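The address classification behind an SSRF guard like the one on amaze_web_fetch can be illustrated for IPv4. This sketch assumes DNS resolution has already produced a dotted-quad address, and it covers IPv4 only; a real guard also has to handle IPv6 and re-check after redirects. The function names and the fail-closed handling of unparseable input are assumptions, not the extension's implementation.

```typescript
type AddressClass = "public" | "private" | "loopback" | "link-local" | "unspecified" | "invalid";

// Classify a resolved IPv4 address using the well-known reserved ranges.
function classifyIPv4(addr: string): AddressClass {
  const parts = addr.split(".");
  if (parts.length !== 4) return "invalid";
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (!octets.every((o) => o >= 0 && o <= 255)) return "invalid";
  const a = octets[0] ?? -1;
  const b = octets[1] ?? -1;
  if (a === 0) return "unspecified"; // 0.0.0.0/8
  if (a === 127) return "loopback"; // 127.0.0.0/8
  if (a === 169 && b === 254) return "link-local"; // 169.254.0.0/16
  if (a === 10) return "private"; // 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return "private"; // 172.16.0.0/12
  if (a === 192 && b === 168) return "private"; // 192.168.0.0/16
  return "public";
}

// Fail closed: unparseable input is never fetched; non-public ranges are
// allowed only when the caller opts in (mirrors allow_private: true).
function isFetchAllowed(resolvedIp: string, allowPrivate = false): boolean {
  const cls = classifyIPv4(resolvedIp);
  if (cls === "invalid") return false;
  return cls === "public" || allowPrivate;
}
```

Classifying the resolved address rather than the hostname string is the important design choice: a public-looking name can still resolve to 127.0.0.1 or to a cloud metadata address such as 169.254.169.254.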
Documentation
- AGENTS.md: host-agent operating policy
- docs/install.md: install steps (host + pi-amaze-tools + 0GM)
- docs/environment.md: env-var reference
- docs/orchestration.md: host ↔ pi-amaze-tools ↔ 0GM flow
- docs/security.md: redaction and secret-hygiene rules
License
MIT.
Grace went looking for answers past the starlight; Rocky tried not to lose his crewmate. This project works the same way — one path takes the outside (web), one takes the inside (source), so the host LLM can carry the heavy reasoning without drowning in raw output.
