# @startanaicompany/llmtools

CLI delegation tools for AI agents. Offload the mechanical work — summarising large files, exploring codebases, generating boilerplate, triaging logs, reading reference docs, first-pass code review — to a cheap sub-agent (default deepseek-v4-flash) so the orchestrator's premium tokens stay focused on judgment.
Integrating with Claude Code or another agent runtime? See INTEGRATION.md for a drop-in CLAUDE.md snippet plus the prerequisites checklist.

Status: v0.1, internal alpha. See PRD.md for the design rationale.
## Install

```shell
npm install -g @startanaicompany/llmtools
```

Provides one binary: `llmtools`.
## How routing works
llmtools does NOT pick a model itself. It posts every call to the SaaC proxy with an X-Subagent-Mode header — the proxy decides which backend (DeepSeek-flash, DeepSeek-pro, etc.) handles the request. To pick a different sub-agent, set LLMTOOLS_SUBAGENT_MODE for the agent or pass --mode <name> per call.
Defensive guard: if the proxy silently falls back to a different model family than the one you asked for (e.g. you requested deepseek-* and got back claude-*), the call fails fast rather than serving you the wrong thing.
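The guard is easy to picture as a prefix comparison. A minimal sketch — this `check_model_family` helper is illustrative only, not part of llmtools, and it assumes mode names and served model IDs both start with a vendor prefix like `deepseek`:

```shell
# Illustrative sketch only — not the shipped implementation.
# Compare the vendor prefix of the requested mode ("deepseek-v4-flash")
# against the model the proxy actually served ("deepseek/deepseek-v4-...").
check_model_family() {
  requested_family="${1%%-*}"    # "deepseek-v4-flash" -> "deepseek"
  served="${2#*/}"               # drop any "vendor/" prefix from the model ID
  served_family="${served%%-*}"  # "deepseek-v4-flash-20260423" -> "deepseek"
  if [ "$requested_family" != "$served_family" ]; then
    echo "model family mismatch: asked for $1, got $2" >&2
    return 1
  fi
}
```

Under this sketch, a `claude-*` answer to a `deepseek-*` request returns nonzero, which is the fail-fast behaviour described above.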
## Configure

llmtools posts metrics and logs to OTLP and refuses to start without an endpoint to ship to. All required env vars must be set on the agent host:
Proxy (required):
| Variable | Purpose |
|---|---|
| ANTHROPIC_API_KEY | API key for the proxy (e.g. prx_...). |
| ANTHROPIC_CUSTOM_HEADERS | One or more Header-Name: value lines (newline-separated). Typically X-Agent-Context: organization=...,user=...,agent=.... |
| ANTHROPIC_BASE_URL | Optional. Defaults to https://prxy.startanaicompany.com. |
Telemetry / OTLP (required — same vars Claude Code uses):
| Variable | Purpose |
|---|---|
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP collector base URL (e.g. https://otlp.startanaicompany.com). |
| OTEL_EXPORTER_OTLP_HEADERS | Comma-separated k=v pairs. Typically Authorization=Bearer <token>. |
| OTEL_EXPORTER_OTLP_PROTOCOL | http/protobuf (recommended) or http/json. |
| OTEL_RESOURCE_ATTRIBUTES | Comma-separated k=v pairs. Should include organization=, user=, agent=. |
| OTEL_LOGS_EXPORTER | otlp. |
| OTEL_METRICS_EXPORTER | otlp. |
| OTEL_TRACES_EXPORTER | Optional. llmtools doesn't emit traces; none is fine. |
If any required variable is missing, every command exits 78 (EX_CONFIG) with a clear error before making any network calls.
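Putting both required tables together, a minimal agent-host setup might look like the following (every value below is a placeholder, not a working credential):

```shell
# Proxy — substitute your real key and context values.
export ANTHROPIC_API_KEY="prx_xxxxxxxx"
export ANTHROPIC_CUSTOM_HEADERS="X-Agent-Context: organization=acme,user=jane,agent=ci-bot"
export ANTHROPIC_BASE_URL="https://prxy.startanaicompany.com"  # optional; this is the default

# Telemetry / OTLP — same shape as Claude Code's OTLP vars.
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp.startanaicompany.com"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer example-token"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
export OTEL_RESOURCE_ATTRIBUTES="organization=acme,user=jane,agent=ci-bot"
export OTEL_LOGS_EXPORTER="otlp"
export OTEL_METRICS_EXPORTER="otlp"
```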
Optional knobs:
| Variable | Default | Purpose |
|---|---|---|
| LLMTOOLS_SUBAGENT_MODE | deepseek-v4-flash | Sub-agent mode sent in the X-Subagent-Mode header. Per-call --mode <name> overrides. |
| LLMTOOLS_LOG_LOCAL | 0 | Set to 1 to also append every event to a local JSONL log (for llmtools stats). |
| LLMTOOLS_LOG_PATH | ~/.llmtools/usage.jsonl | Override the local log path. |
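For example, to keep a local mirror of every telemetry event (so `llmtools stats` has data to read) — the path shown is just the documented default:

```shell
export LLMTOOLS_LOG_LOCAL=1                             # also append events locally
export LLMTOOLS_LOG_PATH="$HOME/.llmtools/usage.jsonl"  # documented default path
```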
## Commands

Each command writes structured JSON to stdout and a single telemetry line to stderr on completion.
### llmtools summarize

Read a (potentially huge) file and return a digest.

```shell
llmtools summarize big_file.py --focus "auth flow" --format outline
cat big_file.py | llmtools summarize --stdin --format tldr
```

### llmtools explore
Codebase search. Uses ripgrep for the grep step (must be on PATH), then asks the model to synthesise the matches.

```shell
llmtools explore . --query "where is the rate limiter configured?" --paths "src/**"
llmtools explore . --query "session token storage" --depth deep
```

### llmtools generate
Bulk mechanical code generation.

```shell
llmtools generate --kind tests --spec spec.md --context src/foo.ts
llmtools generate --kind types --stdin --target-path src/generated/foo.ts --write
```

`--kind`: tests | fixture | crud | types | migration.

For higher-quality output, pass `--mode deepseek-v4-pro`.
### llmtools triage

Parse logs, stack traces, and test failures into ranked root-cause hypotheses.

```shell
llmtools triage app.log
journalctl -u myservice -n 500 | llmtools triage --stdin
```

### llmtools read-doc
Read long reference material (PDF, plain text, URL) and answer a question or extract a section. PDFs require pdftotext (poppler-utils).

```shell
llmtools read-doc spec.pdf --query "what does the spec say about retry policy?"
llmtools read-doc https://example.com/rfc.txt --extract "Section 3.2" --cite
```

### llmtools review
First-pass code review. Returns structured findings.

```shell
llmtools review src/payment.ts --focus security --style strict
llmtools review --diff main..HEAD --focus all --mode deepseek-v4-pro
```

### llmtools stats
Local rolling cost summary (reads the local telemetry log if `LLMTOOLS_LOG_LOCAL=1` is enabled).

```shell
llmtools stats --days 7
```

## Output contract
Every successful command writes exactly one JSON object to stdout and exits 0. Every command writes exactly one telemetry JSON line to stderr (success or failure):

```json
{"ts":"2026-05-04T12:34:56Z","cmd":"summarize","model":"deepseek/deepseek-v4-flash-20260423","mode":"deepseek-v4-flash","tokens_in":253,"tokens_out":62,"cache_read":0,"cache_creation":0,"cost_usd":0.000095,"cost_authoritative":true,"elapsed_ms":1618,"exit":0}
```

`cost_authoritative: true` means the cost came from the proxy's billing; `false` means it was estimated locally from token counts.
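Because the telemetry format is flat (one JSON object per line), a rough spend tally can be scraped without a full JSON parser. A sketch — `total_cost` is a hypothetical helper, and the sed pattern relies on the exact `cost_usd` field name shown above, so it is not general JSON handling:

```shell
# Naive sum of cost_usd across a JSONL telemetry log. Works only because
# the format is flat and the field name is fixed; not a JSON parser.
total_cost() {
  sed -n 's/.*"cost_usd":\([0-9.eE+-]*\).*/\1/p' "$1" | awk '{s += $1} END {printf "%.6f\n", s}'
}
```

With `LLMTOOLS_LOG_LOCAL=1` enabled, `total_cost ~/.llmtools/usage.jsonl` gives a quick local total; `llmtools stats` is the supported way to get the same view.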
## Exit codes
| Code | Meaning |
|---:|---|
| 0 | success |
| 64 | bad CLI arguments (EX_USAGE) |
| 65 | input file unreadable / empty (EX_DATAERR) |
| 69 | upstream API hard-failed (auth / 4xx) (EX_UNAVAILABLE) |
| 75 | upstream transient — agent should retry or fall back (EX_TEMPFAIL) |
| 78 | environment not configured (EX_CONFIG) |
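The 75-vs-69 split is what makes automated retries safe. A sketch of a wrapper an agent harness might use (`with_retry` is illustrative, not part of llmtools): retry only on 75, propagate every other exit code unchanged:

```shell
# Retry only on EX_TEMPFAIL (75); any other exit code is returned as-is.
with_retry() {
  attempts="$1"; shift
  i=0
  while :; do
    "$@" && rc=0 || rc=$?
    if [ "$rc" -ne 75 ]; then return "$rc"; fi   # success or hard failure: stop
    i=$((i + 1))
    if [ "$i" -ge "$attempts" ]; then return "$rc"; fi
    sleep "$i"   # crude linear backoff between attempts
  done
}
```

For example, `with_retry 3 llmtools triage app.log` would retry a transient upstream failure up to three times but give up immediately on a 69 (auth/4xx) or 64 (usage) error.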
## When to reach for llmtools vs. internal tools

Suggested rule for an agent's system prompt:

> Before reading a file > 1,000 lines into context, prefer `llmtools summarize <path> --focus "<what you need>"`. Before grepping a large repo, prefer `llmtools explore`. Before writing routine boilerplate (tests, CRUD, fixtures), prefer `llmtools generate`. For initial review of a diff, run `llmtools review` first. For long log dumps, `llmtools triage`. For PDF/manual reading, `llmtools read-doc`.
>
> Use internal tools when: (a) you need to act on the result step-by-step in this turn, (b) the work is < 500 tokens of effort, (c) the file contains proprietary APIs not in the model's training, or (d) the task requires vision.
## Development

```shell
npm install --ignore-scripts   # postinstall in some deps hangs on certain filesystems
npm test                       # unit tests, hermetic (mocked client)
npm run test:integration       # live tests against the proxy (needs env vars)
npm run typecheck
npm run build                  # produces dist/cli.js (ESM, with shebang)
```

System tools the live runtime expects on $PATH:

- ripgrep (rg) — used by `llmtools explore`
- pdftotext (poppler-utils) — used by `llmtools read-doc` for PDF input
- git — used by `llmtools review --diff <revision>`
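A preflight check along these lines (the `preflight` helper is illustrative, not shipped with the package) can catch missing tools before a live run:

```shell
# Report any of the expected external tools that are absent from $PATH.
preflight() {
  missing=0
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool" >&2; missing=1; }
  done
  return "$missing"
}

preflight rg pdftotext git || echo "install the tools above before live runs" >&2
```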
## License
MIT
