@themobiusstrip/coble
v0.5.1
Published
A local, provider-agnostic agent CLI — LangGraph.js core, Ink TUI, durable sessions, human-in-the-loop approvals, built-in evals.
Maintainers
Readme
⛵ coble
A local, provider-agnostic coding-agent CLI for learning agentic security — a small, readable codebase that makes an AI agent's trust boundary explicit and auditable. Built on LangGraph.js and Ink.
A small coding agent for your terminal: durable SQLite-checkpointed sessions, human-in-the-loop approvals, provider-agnostic (OpenAI · Anthropic · Google · Ollama), and built-in evals. The part worth reading is the trust boundary.
Security principle
Assume the model is compromised — then be honest, in code, about the one layer that actually contains it.
- The LLM is untrusted — every tool call is the crossing from untrusted planner to your real machine.
- Exactly one real boundary:
--sandbox(OS-level filesystem jail + default-deny egress + key scrub). Everything else — classifier, rules, auto-mode, spotlighting — is defense-in-depth, and each says so in the source.
Full threat model, the per-layer "boundary or not?" table, and honest limitations: SECURITY.md.
Quickstart
# 1. install
npm install -g @themobiusstrip/coble
# 2. save a key once — effective from any directory afterwards
coble config set OPENAI_API_KEY sk-... # or ANTHROPIC_API_KEY / GOOGLE_API_KEY
# default to Google AI: coble config set COBLE_MODEL google:gemini-3.5-flashNo key yet? Just run coble — a first-run wizard picks a provider, validates the key with a live request, and saves it globally. Verify with coble doctor.
Local/proxy endpoint (e.g. meridian, LiteLLM — anything speaking the Anthropic API): point ANTHROPIC_BASE_URL at it (read by the Anthropic SDK, so a shell var or .env is enough):
ANTHROPIC_BASE_URL=http://127.0.0.1:3456 ANTHROPIC_API_KEY=x COBLE_MODEL=anthropic:claude-opus-4-8 cobleEval results
Assertions check task outcomes (file contents, branch state, refusals), so they hold across models. Scripted runs are deterministic and key-free (every CI push); the model column is a recorded baseline.
| Suite | Passed | Cost |
| --- | --- | --- |
| scripted (CI) | 18/18 | $0 |
| openai:gpt-5.5 | 16/16 | ~$0.20 |
gpt-5.5 is the 2026-06-11 baseline (evals/RESULTS.md); refresh with coble eval -m openai:gpt-5.5 --write. Task defs: evals/tasks/.
Configuration
| Method | Scope | Example |
| --- | --- | --- |
| coble config set KEY <value> | global — every run (~/.coble/env, mode 600) | coble config set OPENAI_API_KEY sk-... |
| shell environment | current shell / profile | export ANTHROPIC_API_KEY=sk-ant-... |
| project .env | runs started from that directory | OPENAI_API_KEY=sk-... in ./.env |
| -m flag | single invocation (model only) | coble -p -m ollama:llama3.1 "..." |
Precedence (first wins): -m flag → shell env → project .env → global config.
Keys coble reads:
OPENAI_API_KEY·ANTHROPIC_API_KEY·GOOGLE_API_KEY— provider keys (OpenAI / Anthropic / Google; free Google keys 429 quickly).TAVILY_API_KEY— key forweb_search(Tavily); absent ⇒ the tool reports it's unconfigured (no crash).COBLE_MODEL— default model when-mis omitted, e.g.openai:gpt-5.5.OLLAMA_HOST— Ollama endpoint (defaulthttp://localhost:11434).COBLE_HOME— state dir (sessions, checkpoints, audit log, global config; default~/.coble).
Inspect with coble config list / path and coble doctor. Keys are stored mode 0600, masked in output, and never written to the audit log or session store.
Docker
docker build -t coble .
# state + global config persist via the /data mount; run against the current repo
docker run --rm -it \
-v "$PWD:/workspace" -v "$HOME/.coble:/data" -e OPENAI_API_KEY \
coble -p -m openai:gpt-5.5 "summarize this repository"Commands
| Command | Description |
| --- | --- |
| coble [task] | interactive TUI (or one-shot with -p) |
| coble config set/get/list/unset/path | manage global config — keys, default model |
| coble policy install/status/uninstall [--project] | install the security-policy block into AGENTS.md (user-level, or --project); human-only |
| coble doctor [--no-ping] | check setup: keys, model, connectivity, git/gh |
| coble review [path] | audit a repo → write AUDIT.md → branch + commit + PR (dry-run) |
| coble sessions | list sessions with status, steps and estimated cost |
| coble resume <id> | continue a session from its last checkpoint |
| coble audit | show the tool-call audit log (format & demo) |
| coble eval [--model spec] [--write] | run the eval suite |
Flags
-m, --model <provider:name>—openai:gpt-5.5,anthropic:claude-sonnet-4-6,google:gemini-3.5-flash,ollama:llama3.1, orscripted:file.json-C, --cwd <dir>— workspace root for this run (default: current directory)-p, --print— non-interactive: run one task, print events, exit--permission-mode <mode>—plan/default/careful/auto/bypass(Shift+Tab cycles in the TUI).--paranoidand--dangerously-allowaliascarefulandbypass.--sandbox— confinebash/gitin an OS sandbox (fs jail + default-deny egress); warns and falls back if unavailable. Recommended forcoble review.--strict-sandbox— refuse to run if the sandbox can't engage (implies--sandbox)--allow-domain <host>— permit a host through the egress allowlist under--sandbox(repeatable; alsoCOBLE_ALLOWED_DOMAINS)
Web tools (web_fetch, web_search)
The agent can reach the network via two dangerous-tier tools (approved like bash): web_fetch <url> (GET → text, redirects re-validated, body capped) and web_search <query> (via Tavily; set TAVILY_API_KEY). They run in coble's main process and enforce the egress allowlist themselves — default-deny under --sandbox (--allow-domain to permit), link-local/metadata IPs always refused, output wrapped as untrusted. See SECURITY.md.
Permission modes & rules
| Mode | Behaviour |
| --- | --- |
| plan | read-only — writes/commands blocked; the agent plans without acting |
| default | reads + workspace writes auto-run; dangerous calls ask |
| careful | writes also ask |
| auto | a classifier model judges would-prompt calls (push/PR still ask) — not a boundary; pair with --sandbox |
| bypass | everything auto-runs |
Pre-approve or block commands/paths in ~/.coble/settings.yaml (global) or <repo>/.coble/settings.yaml (project), evaluated deny → ask → allow over the mode. A project file may only tighten (its allow/defaultMode are ignored). Details: SECURITY.md.
# ~/.coble/settings.yaml
permissions:
defaultMode: default
allow: ["Bash(npm test)", "Read(./src/**)"]
ask: ["Bash(git push:*)"]
deny: ["Read(./.env)", "Bash(curl:*)"]
autoMode:
model: anthropic:claude-haiku-4-5 # classifier for `auto` modeContext (AGENTS.md)
coble loads two AGENTS.md layers into the system prompt — same pattern as CLAUDE.md (Claude) / AGENTS.md (Codex):
- user-level —
$COBLE_HOME/AGENTS.md: global conventions, every workspace. - project-level —
<workspace>/AGENTS.md: this repo, appended last so it can override the global one.
It is model guidance, not a hard block — for enforcement use permission rules or --sandbox. AGENTS.md is loaded into the trusted prompt, so review one before running coble in a cloned repo (SECURITY.md; coble review deliberately ignores the target's).
To install a security playbook, run coble policy install with its rendered policy file (not the full doc) — it writes a managed block into AGENTS.md, in place and idempotently. Human-only:
coble policy install path/to/dist/agent-security-policy.md # user-level (~/.coble/AGENTS.md)
coble policy install ./agent-security-policy.md --project # this repo
coble policy status # both scopes
coble policy uninstall # remove the block, keep the restHow it works — the trust boundary
Every tool call the (untrusted) LLM proposes crosses one decision point:
model proposes a tool call · read · write · edit · bash · git/PR · web fetch/search
│
▼
decideCall — one gate per call
rules deny → ask → allow ← ~/.coble (global) + repo .coble/settings.yaml
mode plan · default · careful · auto · bypass ← --permission-mode / Shift+Tab
│
├─ deny → blocked inline (holds even in bypass)
├─ ask → human approval via interrupt()
├─ auto → auto mode: a classifier LLM decides (git push / PR still ask)
└─ allow → runs
│ approved
▼
run — with --sandbox, inside an OS filesystem jail + default-deny network
egress (Seatbelt / bubblewrap) that confines whatever the command does
│
▼
append-only audit logThe deterministic classifier sorts each call into a danger tier — triage for whether to ask a human, never the boundary. The OS sandbox is the boundary — it confines what an approved command can reach, however it's spelled. Honest limits of every layer: SECURITY.md.
Underneath, the loop is a small LangGraph StateGraph checkpointed to SQLite, so interrupt() can pause for approval and a killed run resumes from its last step.
Development
Build, test, and headless/CI usage live in DEVELOP.md.
License
MIT
