# @startanaicompany/llmtools

CLI delegation tools for AI agents. Offload the mechanical work — summarising large files, exploring codebases, generating boilerplate, triaging logs, reading reference docs, first-pass code review — to a cheap sub-agent (default deepseek-v4-flash) so the orchestrator's premium tokens stay focused on judgment.
Integrating with Claude Code or another agent runtime? See INTEGRATION.md for a drop-in CLAUDE.md snippet plus the prerequisites checklist.

Status: v0.1, internal alpha. See PRD.md for the design rationale.
## Install

```shell
npm install -g @startanaicompany/llmtools
```

Provides one binary: `llmtools`.
## How routing works
llmtools does NOT pick a model itself. It posts every call to the SaaC proxy with an X-Subagent-Mode header — the proxy decides which backend (DeepSeek-flash, DeepSeek-pro, etc.) handles the request. To pick a different sub-agent, set LLMTOOLS_SUBAGENT_MODE for the agent or pass --mode <name> per call.
Defensive guard: if the proxy silently falls back to a different model family than the one you asked for (e.g. you requested deepseek-* and got back claude-*), the call fails fast rather than serving you the wrong thing.
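The guard is easy to picture as a prefix comparison. A minimal sketch — this `check_model_family` helper is illustrative only, not part of llmtools, and it assumes mode names and served model IDs both start with a vendor prefix like `deepseek`:

```shell
# Illustrative sketch only — not the shipped implementation.
# Compare the vendor prefix of the requested mode ("deepseek-v4-flash")
# against the model the proxy actually served ("deepseek/deepseek-v4-...").
check_model_family() {
  requested_family="${1%%-*}"    # "deepseek-v4-flash" -> "deepseek"
  served="${2#*/}"               # drop any "vendor/" prefix from the model ID
  served_family="${served%%-*}"  # "deepseek-v4-flash-20260423" -> "deepseek"
  if [ "$requested_family" != "$served_family" ]; then
    echo "model family mismatch: asked for $1, got $2" >&2
    return 1
  fi
}
```

Under this sketch, a `claude-*` answer to a `deepseek-*` request returns nonzero, which is the fail-fast behaviour described above.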
## Configure

llmtools posts metrics and logs to OTLP and refuses to start without an endpoint to ship to. All required env vars must be set on the agent host:
Proxy (required):
| Variable | Purpose |
|---|---|
| ANTHROPIC_API_KEY | API key for the proxy (e.g. prx_...). |
| ANTHROPIC_CUSTOM_HEADERS | One or more Header-Name: value lines (newline-separated). Typically X-Agent-Context: organization=...,user=...,agent=.... |
| ANTHROPIC_BASE_URL | Optional. Defaults to https://prxy.startanaicompany.com. |
Telemetry / OTLP (required — same vars Claude Code uses):
| Variable | Purpose |
|---|---|
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP collector base URL (e.g. https://otlp.startanaicompany.com). |
| OTEL_EXPORTER_OTLP_HEADERS | Comma-separated k=v pairs. Typically Authorization=Bearer <token>. |
| OTEL_EXPORTER_OTLP_PROTOCOL | http/protobuf (recommended) or http/json. |
| OTEL_RESOURCE_ATTRIBUTES | Comma-separated k=v pairs. Should include organization=, user=, agent=. |
| OTEL_LOGS_EXPORTER | otlp. |
| OTEL_METRICS_EXPORTER | otlp. |
| OTEL_TRACES_EXPORTER | Optional. llmtools doesn't emit traces; none is fine. |
If any required variable is missing, every command exits 78 (EX_CONFIG) with a clear error before making any network calls.
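Putting both required tables together, a minimal agent-host setup might look like the following (every value below is a placeholder, not a working credential):

```shell
# Proxy — substitute your real key and context values.
export ANTHROPIC_API_KEY="prx_xxxxxxxx"
export ANTHROPIC_CUSTOM_HEADERS="X-Agent-Context: organization=acme,user=jane,agent=ci-bot"
export ANTHROPIC_BASE_URL="https://prxy.startanaicompany.com"  # optional; this is the default

# Telemetry / OTLP — same shape as Claude Code's OTLP vars.
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp.startanaicompany.com"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer example-token"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
export OTEL_RESOURCE_ATTRIBUTES="organization=acme,user=jane,agent=ci-bot"
export OTEL_LOGS_EXPORTER="otlp"
export OTEL_METRICS_EXPORTER="otlp"
```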
Optional knobs:
| Variable | Default | Purpose |
|---|---|---|
| LLMTOOLS_SUBAGENT_MODE | deepseek-v4-flash | Sub-agent mode sent in the X-Subagent-Mode header. Per-call --mode <name> overrides. |
| LLMTOOLS_LOG_LOCAL | 0 | Set to 1 to also append every event to a local JSONL log (for llmtools stats). |
| LLMTOOLS_LOG_PATH | ~/.llmtools/usage.jsonl | Override the local log path. |
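For example, to keep a local mirror of every telemetry event (so `llmtools stats` has data to read) — the path shown is just the documented default:

```shell
export LLMTOOLS_LOG_LOCAL=1                             # also append events locally
export LLMTOOLS_LOG_PATH="$HOME/.llmtools/usage.jsonl"  # documented default path
```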
## Commands

Each command writes structured JSON to stdout and a single telemetry line to stderr on completion.
### llmtools summarize

Read a (potentially huge) file and return a digest.

```shell
llmtools summarize big_file.py --focus "auth flow" --format outline
cat big_file.py | llmtools summarize --stdin --format tldr
```

### llmtools explore
Codebase search. Uses ripgrep for the grep step (must be on PATH), then asks the model to synthesise the matches.

```shell
llmtools explore . --query "where is the rate limiter configured?" --paths "src/**"
llmtools explore . --query "session token storage" --depth deep
```

### llmtools generate
Bulk mechanical code generation.

```shell
llmtools generate --kind tests --spec spec.md --context src/foo.ts
llmtools generate --kind types --stdin --target-path src/generated/foo.ts --write
```

`--kind`: tests | fixture | crud | types | migration.

For higher-quality output, pass `--mode deepseek-v4-pro`.
### llmtools triage

Parse logs, stack traces, and test failures into ranked root-cause hypotheses.

```shell
llmtools triage app.log
journalctl -u myservice -n 500 | llmtools triage --stdin
```

### llmtools read-doc
Read long reference material (PDF, plain text, URL) and answer a question or extract a section. PDFs require pdftotext (poppler-utils).

```shell
llmtools read-doc spec.pdf --query "what does the spec say about retry policy?"
llmtools read-doc https://example.com/rfc.txt --extract "Section 3.2" --cite
```

### llmtools review
First-pass code review. Returns structured findings.

```shell
llmtools review src/payment.ts --focus security --style strict
llmtools review --diff main..HEAD --focus all --mode deepseek-v4-pro
```

### llmtools stats
Local rolling cost summary (reads the local telemetry log if `LLMTOOLS_LOG_LOCAL=1` is enabled).

```shell
llmtools stats --days 7
```

## Output contract
Every successful command writes exactly one JSON object to stdout and exits 0. Every command writes exactly one telemetry JSON line to stderr (success or failure):

```json
{"ts":"2026-05-04T12:34:56Z","cmd":"summarize","model":"deepseek/deepseek-v4-flash-20260423","mode":"deepseek-v4-flash","tokens_in":253,"tokens_out":62,"cache_read":0,"cache_creation":0,"cost_usd":0.000095,"cost_authoritative":true,"elapsed_ms":1618,"exit":0}
```

`cost_authoritative: true` means the cost came from the proxy's billing; `false` means it was estimated locally from token counts.
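Because the telemetry format is flat (one JSON object per line), a rough spend tally can be scraped without a full JSON parser. A sketch — `total_cost` is a hypothetical helper, and the sed pattern relies on the exact `cost_usd` field name shown above, so it is not general JSON handling:

```shell
# Naive sum of cost_usd across a JSONL telemetry log. Works only because
# the format is flat and the field name is fixed; not a JSON parser.
total_cost() {
  sed -n 's/.*"cost_usd":\([0-9.eE+-]*\).*/\1/p' "$1" | awk '{s += $1} END {printf "%.6f\n", s}'
}
```

With `LLMTOOLS_LOG_LOCAL=1` enabled, `total_cost ~/.llmtools/usage.jsonl` gives a quick local total; `llmtools stats` is the supported way to get the same view.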
## Exit codes
| Code | Meaning |
|---:|---|
| 0 | success |
| 64 | bad CLI arguments (EX_USAGE) |
| 65 | input file unreadable / empty (EX_DATAERR) |
| 69 | upstream API hard-failed (auth / 4xx) (EX_UNAVAILABLE) |
| 75 | upstream transient — agent should retry or fall back (EX_TEMPFAIL) |
| 78 | environment not configured (EX_CONFIG) |
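The 75-vs-69 split is what makes automated retries safe. A sketch of a wrapper an agent harness might use (`with_retry` is illustrative, not part of llmtools): retry only on 75, propagate every other exit code unchanged:

```shell
# Retry only on EX_TEMPFAIL (75); any other exit code is returned as-is.
with_retry() {
  attempts="$1"; shift
  i=0
  while :; do
    "$@" && rc=0 || rc=$?
    if [ "$rc" -ne 75 ]; then return "$rc"; fi   # success or hard failure: stop
    i=$((i + 1))
    if [ "$i" -ge "$attempts" ]; then return "$rc"; fi
    sleep "$i"   # crude linear backoff between attempts
  done
}
```

For example, `with_retry 3 llmtools triage app.log` would retry a transient upstream failure up to three times but give up immediately on a 69 (auth/4xx) or 64 (usage) error.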
## When to reach for llmtools vs. internal tools

Suggested rule for an agent's system prompt:

> Before reading a file > 1,000 lines into context, prefer `llmtools summarize <path> --focus "<what you need>"`. Before grepping a large repo, prefer `llmtools explore`. Before writing routine boilerplate (tests, CRUD, fixtures), prefer `llmtools generate`. For initial review of a diff, run `llmtools review` first. For long log dumps, `llmtools triage`. For PDF/manual reading, `llmtools read-doc`.
>
> Use internal tools when: (a) you need to act on the result step-by-step in this turn, (b) the work is < 500 tokens of effort, (c) the file contains proprietary APIs not in the model's training, or (d) the task requires vision.
## Development

```shell
npm install --ignore-scripts   # postinstall in some deps hangs on certain filesystems
npm test                       # unit tests, hermetic (mocked client)
npm run test:integration       # live tests against the proxy (needs env vars)
npm run typecheck
npm run build                  # produces dist/cli.js (ESM, with shebang)
```

System tools the live runtime expects on $PATH:

- ripgrep (rg) — used by `llmtools explore`
- pdftotext (poppler-utils) — used by `llmtools read-doc` for PDF input
- git — used by `llmtools review --diff <revision>`
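A preflight check along these lines (the `preflight` helper is illustrative, not shipped with the package) can catch missing tools before a live run:

```shell
# Report any of the expected external tools that are absent from $PATH.
preflight() {
  missing=0
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool" >&2; missing=1; }
  done
  return "$missing"
}

preflight rg pdftotext git || echo "install the tools above before live runs" >&2
```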
## License
MIT
